Database re-organizing system and database

ABSTRACT

A reorganization system permitting reorganization in an operating implementation of the data storage and retrieval system invention disclosed in Japanese Patent Hei11-310096 Japanese Patent 3345628. The invention permits the reorganization of primary keys and blocks storing records by means of the provision of current location tables LC and new location tables LN and writing location table entries from LC to LN for one or multiple blocks in a given reorganization pass. Alternate-key indices may also be reorganized in the same fashion. Reorganization is performed continuously. The use of reorganization pointers to indicate during reorganization how far reorganization has advanced permits reorganization of a database without suspending the database while executing data retrieval, addition, updating and deletion by means of primary keys and alternate keys.

DETAILED DESCRIPTION OF THE INVENTION

1. Field of the Invention

The invention relates to the field of computerized database storage and retrieval systems and particularly to database systems that permit the uninterrupted and automatic reorganization of the database while the system is operating.

2. Description of Related Art

Conventional computerized database storage and retrieval systems have generally employed hierarchical indices, as described in Jeffrey D. Ullman, Deetabaesu Shisutemu no Genri [Principles of Database Systems], 1st ed. (trans. Kunii et al., Nihon Konpyuuta Kyoukai, 25 May 1985, pp. 45-71), Samuel Leffler et al., UNIX (Touroku Shoyhyou) 4.3BSD no Sekkei to Jissou [The Design and Implementation of UNIX® BSD 4.3] (trans. Akira Nakamura et al., Maruzen K. K., 30 Jun. 1991, pp. 193-191) and Michael J. Folk et al., “Fairu Kouzou” [File Structures], bit supplement (trans. H. Kusumoto, Kyouritsu Shuppan K. K., 5 Jun. 1997, pp. 169-191).

These conventional database storage and retrieval systems suffer from such shortcomings as:

-   (1) Load deriving from the creation and maintenance of indices;
-   (2) The need for advance generation of blocks of the maximum size that is foreseen will be utilized; and
-   (3) Susceptibility, due to the hierarchical structure of the indices, to the expansion of exclusion ranges and deadlock resulting from modifications to a higher-order index when the insertion or deletion of data results in the updating of an index.

In order to resolve these shortcomings of conventional database storage and retrieval systems, the inventors have proposed a data storage and retrieval system (Japanese Patent publication number Hei11-231096 and U.S. Pat. No. 6,415,375) that provides acceleration and ease of maintenance by introducing the concepts of location tables and alternate-key tables in place of conventional hierarchical indices, simplifying the complex processing that accompanies indexing and applying binary searches to the tables themselves.

A simple description follows of the data storage and retrieval system proposed by the inventors. That system employs location tables and alternate-key tables and applies binary-search techniques to these tables to retrieve target records. Records are stored in storage regions of fixed length termed blocks. Location tables are reserved in contiguous regions. These contiguous regions are in logical order and may be in physically separate regions. Block addresses are held in location table entries. Records are stored in blocks in the order of their primary keys (also termed unique keys in some types of databases, one example being employee codes in a database of employees) such that the primary keys of the records in a given block are smaller than those of the records in the block immediately subsequent. Records are initially stored in a primary block. Once a primary block has become full it can no longer accept a record that is to be inserted into it, so an overflow block is created and linked to that primary block, a part of the records is moved to the overflow block, and the new record is then stored in the primary block. After that overflow block has become full and another record is then to be inserted, another overflow block is created and linked to that overflow block.

This linkage does not refer to a physical linkage; rather, this expression is employed (here and also below) since the state in which a primary block holds the address of a first overflow block and the first overflow block holds the address of a second overflow block may be handled as though the blocks were physically connected. Thus, the data storage and retrieval system proposed by the inventors provides the advantage that, since overflow blocks may be linked without limit, circumstances will not arise in which a record cannot be stored.
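The retrieval path described above (binary search on a flat location table, then a walk along the linked overflow blocks) can be sketched as follows. This is a minimal illustration under stated assumptions, not the inventors' implementation: the names `Block`, `LocationEntry` and `find_record` are hypothetical, each location table entry is assumed to hold the minimum primary key of its block chain, and an in-memory object reference stands in for a block address.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Block:
    # Records kept in ascending primary-key order as (primary_key, payload).
    records: list = field(default_factory=list)
    # "Linkage" is only a stored reference to the next overflow block.
    overflow: Optional["Block"] = None


@dataclass
class LocationEntry:
    min_key: int   # assumed: smallest primary key in the block and its chain
    block: Block   # stands in for the block address


def find_record(location_table, key):
    """Binary-search the location table, then walk the overflow chain."""
    mins = [entry.min_key for entry in location_table]
    i = bisect_right(mins, key) - 1      # last entry whose min_key <= key
    if i < 0:
        return None                      # key precedes every block
    block = location_table[i].block
    while block is not None:
        for k, payload in block.records:
            if k == key:
                return payload
        block = block.overflow           # more overflow blocks, more work
    return None


b1 = Block([(10, "a")], overflow=Block([(20, "b")]))
b2 = Block([(30, "c"), (40, "d")])
table = [LocationEntry(10, b1), LocationEntry(30, b2)]
print(find_record(table, 20))  # located in the overflow block linked to b1
```

The loop over the overflow chain is where the extra retrieval cost of linked overflow blocks appears: the binary search finds the primary block directly, but every linked block may have to be visited afterwards.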

Problems Solved by the Invention

However, the following problems arise with the data storage and retrieval system proposed by the inventors.

-   (1) When multiple overflow blocks are linked as described above, it takes a longer time than when primary blocks alone exist to retrieve a target record after searching location table entries and identifying the primary block.
-   (2) As well, records are stored in blocks in the order of their primary key values, and this is so both within an overflow block(s) linked to a primary block and across a primary block and an overflow block(s). Since records are thus stored in the order of their primary key values, the insertion of a record may require the movement of records across multiple overflow blocks when many overflow blocks are linked, which takes longer than when primary blocks alone exist.
-   (3) Additionally, when few record insertions occur after the generation of an overflow block or when the deletion of data results in fewer records in a block, the result is empty space in that primary block or overflow block, and the empty space in a block goes unused unless records are inserted in that block, with the result of lower storage efficiency in storage regions.
-   (4) Similarly, entries in alternate-key tables are normally the result of insertions and so alternate-key blocks are susceptible to the generation of alternate-key overflow blocks, with the attendant problem of lower alternate-key access speeds than when alternate-key overflow blocks do not exist. This is because even when the target alternate-key block for a target key value is retrieved, when overflow blocks are linked to that alternate-key block, these too must be retrieved in order to retrieve the target entry.
-   (5) Furthermore, as described with respect to blocks, the insertion of an entry requires storage in an alternate-key block in the order of the alternate-key values, and when alternate-key overflow blocks exist, that order applies likewise across alternate-key block(s) and alternate-key overflow blocks. Therefore, it takes longer to retrieve a target entry than when alternate-key blocks alone exist. Additionally, the insertion of an alternate-key entry requires the movement of entries, and it is a drawback that when there are many alternate-key overflow blocks, the volume moved is large.
-   (6) In addition, the storage of keys in alternate-key tables is done, as described above, by means of insertion rather than addition, and in the data storage and retrieval system proposed by the inventors, the use of pre-alternate-key tables was also considered, in addition to the absorption of the generation of alternate-key overflow blocks, as a device to accommodate such insertion, but it has proven problematic to eliminate overflow blocks entirely.

However, problems (1) through (6) described above are more marked in conventional data storage and retrieval systems that employ hierarchical indices. Since the insertion of data in such systems results in the splitting of indices, they have suffered from the drawbacks of changes to index structure and slower access speeds.

In order to eliminate these drawbacks in conventional data storage and retrieval systems that employ hierarchical indices, it is necessary to use methods called regeneration or reorganization.

These methods called regeneration or reorganization work as follows. The system is temporarily shut down entirely and the records (data) stored on the system are all copied to other storage media. The original data (records, primary-key index and alternate-key indices) are then erased and the records written back. Next, the primary-key indices and alternate-key indices are created. Creation of the primary-key indices consists of reading primary keys from all the records, creating entries that combine them with, for example, the addresses of the blocks in which records are stored, then sorting by the primary keys and creating first the lowest-order index and then sequentially creating the higher-order indices.

The creation of the alternate-key indices likewise consists of reading the alternate keys from the records, creating entries that combine them with, for example, the addresses of the blocks in which records are stored, then sorting by the alternate keys and creating first the lowest-level index and then sequentially creating the higher-level indices. Where multiple kinds of alternate keys exist, this procedure must be executed for each type of alternate key. Since the operations are performed in this order, regeneration can take long periods of time, from several hours to several hundreds of hours in conventional data storage and retrieval systems that employ hierarchical indices, depending on such factors as the volume of data and the kind (number) of indices, and suffers from the drawback that the system cannot be used during that time.

And since such cumbersome methods must be performed in a number of stages, it has been problematic to implement regeneration in an entirely automatic fashion that does not require human intervention, and these methods entail the further administrative problem of system personnel having to perform regeneration through the night or on holidays, for example.

Furthermore, regeneration cannot be performed on systems that run uninterrupted around the clock, even when access efficiency falls, thus incurring costs in the form of high-performance hardware to make up for the unavoidable deterioration in performance that befalls them.

Such developments as the use of redundant hardware have enabled uninterrupted operation, and the difficulty of uninterrupted database operation has recently become a particular problem.

Meanwhile, the data storage and retrieval system proposed by the inventors is basically equivalent to conventional data storage and retrieval systems that employ hierarchical indices in terms of regeneration. Because it does not make use of complicated indices but is comprised of a location table and alternate-key tables, regeneration takes far less time to complete than it does with conventional data storage and retrieval systems that employ hierarchical indices; nevertheless, the database may not be used during regeneration.

Specifically, regeneration in the data storage and retrieval system proposed by the inventors consists of shutting down the operation of a primary system, reading the records stored in the blocks, storing them on separate storage media and then recreating the location table, blocks and alternate-key tables for each type of alternate key of the primary system, creating location table entries while storing the records stored on the separate storage media back into newly created blocks and, after completing the storage of the data, creating entries in alternate-key tables for each type of alternate key. Since this method requires the shutdown of the primary system, as in conventional data storage and retrieval systems, it has been problematic to apply it to an uninterrupted system.

The present invention was developed in light of the problems discussed above with the objective of providing a database system permitting uninterrupted and automatic reorganization of the data and the databases while the system is running.

Means for Solving the Problem

In order to achieve the above objective, the present invention is as follows.

Basis of the Invention

First of all, the data storage and retrieval system proposed by the inventors makes use of location tables and alternate-key tables and retrieves target records by means of binary searches performed on these tables. These records are stored in storage regions of fixed length termed blocks. Of these blocks, primary blocks alone are managed by means of the location tables, and overflow blocks are managed by the primary blocks. Both the location tables and the alternate-key tables are flat tables lacking a hierarchical structure. These characteristics of the data storage and retrieval system proposed by the inventors are exploited to perform reorganization of the location table and blocks. The alternate-key tables are likewise subjected to reorganization.

Reorganization here refers to the re-ordering of data or indices in response to the variation in data structure and the slower data access that result from data insertion, addition, modification and deletion performed on the data stored in a system, and consists of (i) the elimination of overflow blocks, (ii) the elimination of fragmentation and (iii) the reservation of suitable initial storage rates. Descriptions follow of (i) through (iii) above.

(i) The elimination of overflow blocks consists of the following. Overflow blocks result from the insertion of records. When records have filled a primary block and a further record is then to be inserted into that primary block, the insertion cannot be performed as is. In order to allow such an insertion, an overflow block is allocated to that primary block, the necessary number of records is moved from the primary block to the overflow block and the object record is then inserted into the original primary block.

However, since record retrieval incurs a greater load when overflow blocks exist than when primary blocks alone exist, overflow blocks must be converted into primary blocks and managed from location tables in order to achieve faster access. Also, since records are stored in blocks in the order of their primary keys, the insertion of a record requires the movement of records, and the number of records moved increases when there are many overflow blocks, resulting in efficiency problems.
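The way an insertion into a full block gives rise to an overflow block can be sketched as follows. This is a minimal sketch under stated assumptions: `CAPACITY`, `Block`, `insert` and `chain_keys` are hypothetical names, capacity is counted in records rather than bytes, and the policy of moving the upper half of the records to the new overflow block is an assumption, since the text says only that "the necessary number" of records is moved.

```python
from bisect import insort
from dataclasses import dataclass, field
from typing import Optional

CAPACITY = 4  # hypothetical number of records a block can hold


@dataclass
class Block:
    records: list = field(default_factory=list)  # sorted (key, payload) pairs
    overflow: Optional["Block"] = None


def insert(primary, key, payload):
    """Insert a record into a primary block or its overflow chain."""
    # Find the block in the chain whose key range should receive the record.
    block = primary
    while (block.overflow is not None and block.overflow.records
           and key >= block.overflow.records[0][0]):
        block = block.overflow
    if len(block.records) >= CAPACITY:
        # The block is full: link a new overflow block directly after it and
        # move the upper half of its records there (assumed policy).
        half = len(block.records) // 2
        spill = Block(block.records[half:], block.overflow)
        block.records = block.records[:half]
        block.overflow = spill
        if key >= spill.records[0][0]:
            block = spill
    insort(block.records, (key, payload))  # keep primary-key order


def chain_keys(primary):
    """Collect the keys of a primary block and all linked overflow blocks."""
    keys, block = [], primary
    while block is not None:
        keys.extend(k for k, _ in block.records)
        block = block.overflow
    return keys


primary = Block()
for k in (10, 20, 30, 40, 25):
    insert(primary, k, f"record-{k}")
print(chain_keys(primary))
```

After the fifth insertion the primary block has one overflow block linked to it, and key order still holds across the whole chain. Each further overflow block lengthens the chain that retrieval and insertion must traverse, which is the load that reorganization removes.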

(ii) The elimination of fragmentation consists of the following. Fragmentation consists in the dispersion of empty space within storage regions. When a record stored in a block (either a primary block or an overflow block) is no longer required and is erased, the space it once occupied in the storage region of the block is then empty. The storage region will remain empty and be wasted unless there is a record to insert. As well, where an overflow block is generated and few records are stored in that overflow block relative to its storage capacity, that empty space will go unused as wasted storage region unless record insertions subsequently occur.

In order to eliminate such wasted storage regions, records stored in a subsequent block are moved to the anterior block so that the anterior block is adequately filled, thereby eliminating waste in the use of storage regions.

(iii) The reservation of suitable initial storage rates is described below. Initial storage rates are used to prevent to some extent the generation of overflow blocks by leaving a certain proportion of the space in a block empty when first creating the block and writing records to it. When initially storing records in a block, these records may be stored at 100% of the capacity of the block, but storing records at the full capacity of the block will result in the generation of an overflow block immediately when a record is inserted. In order to avoid this development, when first storing records in a primary block, records are stored up to a limit of, for example, 90% of the storage capacity, and the remaining empty space is used to store records when records are subsequently inserted, making it possible to prevent the immediate generation of an overflow block.
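The arithmetic behind the initial storage rate can be shown with a short sketch. The function names and the block capacity of 20 records are hypothetical, and capacity is counted in records for simplicity; the 90% figure is the example given above.

```python
def initial_fill(capacity, rate_percent):
    """Number of records written to a new block at the given storage rate."""
    return capacity * rate_percent // 100


def inserts_before_overflow(capacity, rate_percent):
    """Insertions a freshly loaded block can absorb without an overflow block."""
    return capacity - initial_fill(capacity, rate_percent)


CAPACITY = 20  # hypothetical records per block
for rate in (100, 90):
    print(rate, initial_fill(CAPACITY, rate),
          inserts_before_overflow(CAPACITY, rate))
```

At a rate of 100% the very first insertion into a block forces an overflow block, whereas at 90% a block of this assumed size accepts two insertions first. Integer percentages are used here only to avoid floating-point rounding in the example.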

The foregoing discussion focuses on location tables, blocks and overflow blocks, but applies equally to alternate-key blocks and alternate-key overflow blocks.

The Invention As It Concerns Primary Keys

The invention as it concerns primary keys consists in the reorganization of location tables and blocks, and is described below.

The invention as it concerns primary keys notes that a location table entry holds, in addition to the number of the block that the entry points to and the address of that block, either one or both of the minimum and maximum primary key values of the records stored in the block and in all of the overflow blocks linked to that block, as needed.

The invention as it concerns primary keys creates a new location table for a current location table and sequentially transfers entries from the current location table to the new location table. For the purposes of the present invention, this transfer may refer either to the duplication of information as is or to the modification, as needed, of a part of that information followed by the writing of the modified information. When performing this sequential transfer, the present invention delinks overflow blocks that are linked to a primary block and adds new entries for them to the new location table, thus rendering those overflow blocks primary blocks in the new location table without moving them. In this manner the invention as it concerns primary keys eliminates overflow blocks.
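The transfer of one entry from the current location table to the new location table, with its overflow blocks delinked and promoted to primary blocks, can be sketched as follows. This is an illustrative sketch, not the patented implementation: `Block`, `LocationEntry` and `transfer_entry` are hypothetical names, entries are assumed to hold only a minimum key and a block reference, and exclusion (locking) is omitted.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Block:
    records: list = field(default_factory=list)  # sorted (key, payload) pairs
    overflow: Optional["Block"] = None


@dataclass
class LocationEntry:
    min_key: int
    block: Block  # stands in for the block address


def transfer_entry(lc_entry, new_table):
    """Transfer one current-table entry to the new table.

    Each overflow block in the chain is delinked and given its own entry in
    the new table, so it becomes a primary block without being moved.
    """
    block = lc_entry.block
    while block is not None:
        next_block = block.overflow
        block.overflow = None                 # delink from the chain
        if block.records:                     # an empty block gets no entry
            new_table.append(LocationEntry(block.records[0][0], block))
        block = next_block


o2 = Block([(30, "c")])
o1 = Block([(20, "b")], overflow=o2)
p = Block([(10, "a")], overflow=o1)
lc = [LocationEntry(10, p)]   # current location table (LC)
ln = []                       # new location table (LN)
transfer_entry(lc[0], ln)
print([entry.min_key for entry in ln])
```

The one LC entry becomes three LN entries that refer to the same block objects as before, which mirrors the point that overflow blocks are re-registered rather than copied.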

Fragmentation is eliminated as follows in the invention as it concerns primary keys. Elimination of fragmentation is implemented by identifying the storage rates of multiple blocks (primary blocks and overflow blocks), moving records between blocks within a set of multiple blocks and, as needed, either newly adding blocks and adding the corresponding location table entries, or releasing blocks that had been in use and deleting the corresponding location table entries.

Reservation of suitable initial storage rates is similar to the elimination of fragmentation and is implemented by moving records so that the amount of space that records take up in a block is that of a prescribed initial storage rate.
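Eliminating fragmentation and restoring the initial storage rate for a small set of blocks can be sketched together as a single redistribution step. The sketch makes several assumptions: `reorganize_unit` is a hypothetical name, blocks are plain lists of records already in primary-key order, capacity is counted in records, and the updating of location table entries for added or released blocks is left out.

```python
def reorganize_unit(blocks, capacity, rate_percent):
    """Redistribute records so each block holds the target number of records.

    Returns (blocks in use, blocks released). Existing block objects are
    reused in place; a new block is added only if more are needed.
    """
    target = max(1, capacity * rate_percent // 100)
    records = [r for block in blocks for r in block]  # key order is preserved
    needed = -(-len(records) // target)               # ceiling division
    in_use = blocks[:needed]
    released = blocks[needed:]
    while len(in_use) < needed:
        in_use.append([])                             # newly added block
    for i, block in enumerate(in_use):
        block[:] = records[i * target:(i + 1) * target]
    for block in released:
        block.clear()                                 # block goes unused
    return in_use, released


# Two sparse blocks followed by a full one (capacity 10, initial rate 90%).
blocks = [[1, 2], [3], list(range(4, 14))]
in_use, released = reorganize_unit(blocks, 10, 90)
print(in_use)
print(len(released))
```

Here thirteen records that occupied three blocks end up in two, the first filled to the assumed 90% rate, and one block is released for reuse.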

To summarize the foregoing description, one or multiple blocks are placed under exclusion for a unit-processing interval and reorganization is performed. This consists of performing the elimination of overflow blocks, the elimination of fragmentation and the reservation of suitable initial storage rates together. When reorganization of the affected blocks has completed, exclusion is lifted on them and they are made available for use. Since it appears to be the processing of a single transaction, this reorganization does not conflict with data updating through regular processing.

Because reorganization is performed as per the method described above, the invention as it concerns primary keys enables reorganization to run automatically and without interrupting access to data. In the present invention, access to one or multiple blocks is delayed under exclusion during the unit-processing interval due to reorganization, but blocks other than these are always accessible. The state in which access to one or multiple blocks is delayed under exclusion during reorganization also occurs in regular record updating and is not a particular problem. Reorganization pointers are used in the invention as it concerns primary keys in order to access records during reorganization. One reorganization pointer each is provided for the current location table and the new location table. The reorganization pointers indicate how far reorganization of the location table and blocks has progressed.

In the invention as it concerns primary keys, when retrieving, storing, updating or deleting a record with a primary key during reorganization, the target primary key value is compared with the primary key value of the record contained in the primary block and overflow block of the entry that the reorganization pointer is pointing to. If the target key value is greater than or equal to the primary key value of the record stored in the block that the reorganization pointer is pointing to, the current location table is used to retrieve the target record, and if the target key value is less than that primary key value, the new location table is used to retrieve the target record.

Here, when the current location table is used to retrieve a target record, a binary search is performed on the range between the reorganization pointer and the final pointer of the current location table. A final pointer is reserved for the location table in advance, its purpose being to indicate the last entry of the location table that is in use.

On the other hand, when the new location table is used to retrieve a target record, a binary search is performed on the range between the head pointer of the new location table and the reorganization pointer of the new location table.

In this way, the use of reorganization pointers in the invention as it concerns primary keys allows the retrieval of target records during reorganization.

Since the updating, addition, insertion and deletion of records also first require finding the target block, these operations may also be achieved with the same logic as that described above.
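The choice between the current and new location tables during reorganization can be sketched as follows. This is a hedged illustration: `locate` is a hypothetical name, entries are reduced to `(min_key, block_id)` tuples, the reorganization pointer of the current table is assumed to index the first entry not yet reorganized, and the reorganization pointer of the new table is assumed to be the count of entries already written (an exclusive upper bound).

```python
from bisect import bisect_right


def locate(key, lc, lc_ptr, lc_final, ln, ln_ptr):
    """Pick the table to search for `key` and binary-search its valid range.

    Returns (table_name, entry_index), with entry_index None when the key
    precedes every entry in the searched range.
    """
    if lc_ptr <= lc_final and key >= lc[lc_ptr][0]:
        # Not yet reorganized: search LC from its pointer to the final pointer.
        name, table, lo, hi = "LC", lc, lc_ptr, lc_final + 1
    else:
        # Already reorganized: search LN from its head to its pointer.
        name, table, lo, hi = "LN", ln, 0, ln_ptr
    mins = [entry[0] for entry in table[lo:hi]]
    i = bisect_right(mins, key) - 1
    return (name, lo + i) if i >= 0 else (name, None)


# LC has four entries; the first two have been reorganized. Block b0 had an
# overflow block o0 (keys from 20), which now has its own entry in LN.
lc = [(10, "b0"), (30, "b1"), (50, "b2"), (70, "b3")]
ln = [(10, "b0"), (20, "o0"), (30, "b1")]
lc_ptr, lc_final, ln_ptr = 2, 3, 3

print(locate(25, lc, lc_ptr, lc_final, ln, ln_ptr))  # served from LN
print(locate(60, lc, lc_ptr, lc_final, ln, ln_ptr))  # served from LC
```

Because the two search ranges never overlap, a lookup sees each block through exactly one table at any moment, which is what lets retrieval continue while the pointers advance.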

Outline of the Invention As It Concerns Alternate Keys

The invention as it concerns alternate keys consists in the reorganization of alternate-key tables.

Although alternate-key tables in the data storage and retrieval system proposed by the inventors had a format consisting of alternate-key blocks alone, in the present invention alternate-key location tables are added to the alternate-key tables and reorganization is performed on the alternate-key tables and the alternate-key location tables. In the invention as it concerns alternate keys, the means for solving the problem for the reorganization of alternate-key tables is similar to the means for the reorganization of the location table and blocks.

In the data storage and retrieval system proposed by the inventors, alternate-key entries, each comprising an alternate-key value and the primary key value of its record, may be stored in the order of their alternate keys; alternate-key blocks are used that may be reserved contiguously in advance in an identical size and in the quantity required; alternate-key table entries are stored in alternate-key blocks in the order of their alternate keys; alternate-key table entries having identical alternate keys are stored in the same alternate-key block; alternate-key overflow blocks are added to alternate-key blocks to store entries when a large number of entries have an identical alternate key or when the insertion of an alternate key cannot be accommodated in an alternate-key block; and one level or more of pre-alternate-key blocks having the same structure as the alternate-key blocks may be used when the initial number of records is fewer than the number of records intended finally to be stored.

Also in the data storage and retrieval system proposed by the inventors, the alternate-key tables themselves are stored in contiguous regions, and target alternate-key blocks are retrieved by performing binary searches on the alternate-key tables. The invention as it concerns alternate keys is a method allowing greater efficiency of reorganization by newly adding alternate-key location tables to these alternate-key tables.

Alternate-key tables here are comprised of alternate-key blocks and alternate-key overflow blocks.
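Retrieval through a non-unique alternate key can be sketched as a binary search over entries held in alternate-key order. This is a simplified illustration under stated assumptions: `primary_keys_for` is a hypothetical name, the entries are flattened into one sorted list rather than divided into alternate-key blocks and overflow blocks, and each entry is an `(alternate_key, primary_key)` pair.

```python
from bisect import bisect_left, bisect_right


def primary_keys_for(entries, alt_key):
    """Return the primary keys of all entries having the given alternate key."""
    alt_keys = [a for a, _ in entries]
    lo = bisect_left(alt_keys, alt_key)    # first entry with this key
    hi = bisect_right(alt_keys, alt_key)   # one past the last such entry
    return [pk for _, pk in entries[lo:hi]]


# Entries sorted by alternate key (here a city name); keys may repeat.
entries = [("Osaka", 3), ("Tokyo", 1), ("Tokyo", 7), ("Yokohama", 2)]
print(primary_keys_for(entries, "Tokyo"))
```

The primary keys returned would then be looked up through the location table to reach the records themselves. Keeping entries with identical alternate keys adjacent is what allows the two bisections to bracket them as one contiguous run.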

The Invention As It Concerns Alternate Keys

The invention as it concerns alternate keys consists in the reorganization of alternate-key tables in formats that maintain alternate-key location tables, and is specifically as described below.

In the invention as it concerns alternate keys, an alternate-key location table entry holds the number and the address of the alternate-key block that entry points to and, as necessary, either one or both of the minimum and maximum primary key values of records stored in the alternate-key block that entry points to and in all alternate-key overflow blocks linked to that alternate-key block.

In the invention as it concerns alternate keys, a new alternate-key location table is created for the current alternate-key location table, and current alternate-key location table entries are sequentially transferred to the new alternate-key location table. In the third application of the invention, when this sequential transfer is effected, alternate-key overflow blocks are delinked, and new entries for them are added to the new alternate-key location table, rendering them alternate-key blocks in the new alternate-key location table.

In the invention as it concerns alternate keys, the elimination of fragmentation is performed as follows. The elimination of fragmentation is achieved by finding the rate of space used in multiple alternate-key blocks and alternate-key overflow blocks, transferring records among those alternate-key blocks and alternate-key overflow blocks, and, as necessary, adding new alternate-key blocks or releasing alternate-key blocks and alternate-key overflow blocks that had been in use and deleting them from alternate-key location table entries.

The reservation of suitable storage rates is performed at the same time as the elimination of overflow blocks and the elimination of fragmentation, and consists in the transfer of entries so that the space taken up by entries in blocks corresponds to the initial storage rate.

The above operations are performed with one or multiple alternate-key blocks placed under exclusion.

When reorganization is performed in this fashion, although access to one or multiple alternate-key blocks is delayed during the unit-processing interval, alternate-key blocks other than these may be accessed. The state of delayed access to one or multiple alternate-key blocks also occurs in regular record updating and does not constitute a particular problem.

In the invention as it concerns alternate keys, reorganization pointers are used to access records during reorganization. One reorganization pointer each is provided for the current alternate-key location table and the new alternate-key location table. The reorganization pointers indicate how far reorganization of the alternate-key location table and alternate-key blocks has progressed.

In the invention as it concerns alternate keys, when retrieving a record with an alternate key during reorganization, the target alternate-key value is compared with the alternate-key value of the record contained in the alternate-key block of the entry that the reorganization pointer is pointing to. If the target key value is greater than or equal to the alternate-key value of the record stored in the alternate-key block that the reorganization pointer is pointing to, the current alternate-key location table is used to retrieve the target entry, and if the target key value is less than that alternate-key value, the new alternate-key location table is used to retrieve the target entry.

Here, when the current alternate-key location table is used to retrieve a target record from an alternate key undergoing reorganization, a binary search is performed on the range between the reorganization pointer and the final pointer of the current alternate-key location table.

On the other hand, when the new alternate-key location table is used to retrieve a target record from an alternate key undergoing reorganization, a binary search is performed on the range between the head pointer of the new alternate-key location table and the reorganization pointer of the new alternate-key location table.

In this way, target entries may be retrieved in the invention as it concerns alternate keys. Since the updating, addition, insertion and deletion of entries also first require finding the target alternate-key block, these operations may also be achieved with the same logic as that described above.

Thus, the present invention is comprised of the following.

-   -   (1) A database reorganization system that is a computerized        system that uses blocks sequentially storing records having one        unique primary key and zero or one or more non-unique alternate        keys, manages the locations of these blocks by means of location        tables that place them in correspondence with addresses in        random access memory and manages a database stored in that        random access memory, wherein a first means is provided of        creating a new location table added to an existing location        table upon receiving a database reorganization command and a        second means is provided of, during a unit-processing interval,        sequentially transferring entries in one or multiple blocks from        the current location table to the new location table and        delinking overflow blocks that are identified, adding new        entries to the new location table and rendering them as primary        blocks in the new location table when sequentially performing        transfers.    -   (2) A database reorganization system that is a computerized        system that uses blocks sequentially storing records having one        unique primary key and zero or one or more non-unique alternate        keys, manages the locations of these blocks by means of location        tables that place them in correspondence with addresses in        random access memory and manages a database stored in that        random access memory, and is provided with means of moving the        records of adjacent blocks to eliminate fragmentation when        storage rates in those blocks fall outside a prescribed range of        values.    
-   (3) The database reorganization system of (1) and (2) above        wherein a means is provided of providing reorganization pointers        to each of the current location table and the new location        table, storing in each of those reorganization pointers the        location at which the reorganization processing ended for one or        multiple blocks during a unit-processing interval and completing        reorganization processing when reorganization reaches a final        pointer.    -   (4) A database reorganization system wherein a comparison means        is provided that, when retrieving a record with a primary key        during reorganization, evaluates whether the target primary key        value is greater than or less than the primary key of the record        contained in the primary block and overflow blocks that the        reorganization pointer is pointing to, and a retrieval means is        provided that, when the target primary key is evaluated by the        comparison means to be greater than or equal to the primary key        of the record stored in the block that the reorganization        pointer is pointing to, uses the current location table to        retrieve the target record and, when the when the target primary        key is evaluated to be less than that primary key, uses the new        location table to retrieve the target record.    
-   (5) A database reorganization system that is a data retrieval and storage system that may sequentially store, in the order of their alternate keys, multiple entries made up of the numbers of blocks storing records of alternate keys and their alternate-key values and the primary keys of those records, uses alternate-key blocks storing those entries, stores the entries of alternate-key tables in alternate-key blocks in the order of their alternate keys, stores the entries of alternate-key tables that have identical alternate keys in identical alternate-key blocks, and adds alternate-key overflow blocks to an alternate-key block and stores entries there when a large number of entries have an identical alternate key or when the insertion of an alternate key cannot be accommodated in an alternate-key block, wherein a first means is provided of creating a new alternate-key location table added to an existing alternate-key location table upon receiving a database reorganization command and a second means is provided of, during a unit-processing interval, sequentially transferring entries in one or multiple blocks from the current alternate-key location table to the new alternate-key location table and delinking alternate-key overflow blocks that are identified, adding new entries to the new alternate-key location table and rendering them as alternate-key blocks in the new alternate-key location table when sequentially performing transfers.
-   (6) A database reorganization system that is a data retrieval and storage system that may sequentially store, in the order of their alternate keys, multiple entries made up of the numbers of blocks storing records of alternate keys and their alternate-key values and the primary keys of those records, uses alternate-key blocks that may be reserved contiguously in advance in the number required and in an identical size, stores the entries of alternate-key tables in alternate-key blocks in the order of their alternate keys, stores the entries of alternate-key tables that have identical alternate keys in identical alternate-key blocks, and adds alternate-key overflow blocks to an alternate-key block and stores entries there when a large number of entries have an identical alternate key or when the insertion of an alternate key cannot be accommodated in an alternate-key block, wherein means is provided of moving the records of adjacent alternate-key blocks to eliminate fragmentation when storage rates in those alternate-key blocks fall outside a prescribed range of values.
-   (7) The database reorganization systems of (4) and (5) above wherein a means is provided of providing a reorganization pointer to each of the current alternate-key location table and the new alternate-key location table, and storing in each of those reorganization pointers the location at which the reorganization processing ended during a unit-processing interval.
-   (8) A database reorganization system wherein a comparison means is provided that, when retrieving a record with an alternate key undergoing reorganization, evaluates whether the target alternate key value is greater than or less than the alternate key of the record contained in the alternate-key block of the entry that the reorganization pointer is pointing to, and a retrieval means is provided that, when the target alternate key is evaluated by the comparison means to be greater than or equal to the alternate key of the entry stored in the alternate-key block that the reorganization pointer is pointing to, uses the current alternate-key location table to retrieve the target record and, when the target alternate key is evaluated to be less than that alternate key, uses the new alternate-key location table to retrieve the target record.
-   (9) A database system capable of storage in a computerized database system of multiple entries comprising the number of the blocks storing alternate keys and the primary keys of those records, wherein alternate-key blocks are used that may be reserved contiguously in advance in the number required and in an identical size, alternate-key location tables are used to manage the location of the alternate-key blocks in storage devices by placing numbers assigned to the alternate-key blocks in correspondence with physical locations in the storage devices, alternate-key entries are stored in the alternate-key blocks in the order of their alternate keys, new alternate-key overflow blocks are allocated to store alternate-key entries when these cannot be stored in the alternate-key blocks, and the locations of the alternate-key blocks in the storage devices are managed by means of the alternate-key location tables.
-   (10) A database system that is a computerized system that uses blocks sequentially storing records having one unique primary key and zero or one or more non-unique alternate keys, manages the locations of these blocks by means of location tables that place them in correspondence with addresses in random access memory, has means of moving the records of adjacent blocks to eliminate fragmentation when storage rates in those blocks fall outside a prescribed range of values, and manages databases stored in that random access memory, wherein contiguous regions are used to store the addresses of unused blocks resulting from the elimination of fragmentation and pointers that point to the starting locations and ending locations of those regions.
-   (11) A database that is a computerized system that uses blocks sequentially storing records having one unique primary key and zero or one or more non-unique alternate keys, manages the locations of these blocks by means of location tables that place them in correspondence with addresses in random access memory and manages databases stored in that random access memory, wherein each block retains the rate of space utilization in that block.
-   (12) A database reorganization system that is a computerized system that uses blocks sequentially storing records having one unique primary key and zero or one or more non-unique alternate keys, manages the locations of these blocks by means of location tables that place them in correspondence with addresses in random access memory and manages databases stored in that random access memory, wherein a first means is provided of creating a new location table added to an existing location table upon receiving a database reorganization command, a second means is provided of, during a unit-processing interval, sequentially transferring entries in one or multiple blocks from the current location table to the new location table and delinking overflow blocks that are identified, adding new entries to the new location table and rendering them as primary blocks in the new location table when sequentially performing transfers, and a third means is provided of, during a unit-processing interval, sequentially transferring one or multiple blocks from the current location table to the new location table.
-   (13) A database reorganization system that is a secondary system        that uses blocks sequentially storing records having one unique        primary key and zero or one or more non-unique alternate keys,        manages the locations of these blocks by means of location        tables that place them in correspondence with addresses in        random access memory and updates its own data with log data        transmitted from a primary system, wherein a first means is        provided of creating a new location table added to an existing        location table upon receiving a database reorganization command        and a second means is provided of, during a unit-processing        interval, sequentially transferring entries in one or multiple        blocks from the current location table to the new location table        and delinking overflow blocks that are identified, adding new        entries to the new location table and rendering them as primary        blocks in the new location table when sequentially performing        transfers.    
-   (14) A database reorganization system that is a secondary system that uses blocks sequentially storing records having one unique primary key and zero or one or more non-unique alternate keys, manages the locations of these blocks by means of location tables that place them in correspondence with addresses in random access memory and updates its own data with log data transmitted from a primary system, wherein a first means is provided of creating a new location table added to an existing location table upon receiving a database reorganization command and a second means is provided of, during a unit-processing interval, sequentially transferring entries in one or multiple blocks from the current location table to the new location table and delinking overflow blocks that are identified, adding new entries to the new location table and rendering them as primary blocks in the new location table when sequentially performing transfers.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of an example of a primary system in which is applied the database reorganization system that is an embodiment of the invention as it concerns primary keys.

FIG. 2 is an outline of only that part of the primary system depicted in FIG. 1 in which is applied the database reorganization system that is an embodiment of the invention as it concerns primary keys.

FIG. 3 illustrates the operation of the database reorganization system that is an embodiment of the invention as it concerns primary keys.

FIG. 4 illustrates the operation of reorganization that eliminates fragmentation in the database reorganization system that is an embodiment of the invention as it concerns primary keys.

FIG. 5 illustrates a method of eliminating overall fragmentation in the database reorganization system that is an embodiment of the invention as it concerns primary keys.

FIG. 6 illustrates a method of eliminating overall fragmentation in the database reorganization system that is an embodiment of the invention as it concerns primary keys.

FIG. 7 illustrates data retrieval and read/write operations during reorganization in the database reorganization system that is an embodiment of the invention as it concerns primary keys.

FIG. 8 illustrates operation when reorganization advances during a retrieval operation in the database reorganization system that is an embodiment of the invention as it concerns primary keys.

FIG. 9 illustrates the database reorganization system that is an embodiment of the invention as it concerns alternate keys.

FIG. 10 illustrates an alternate-key table in a primary system in the database reorganization system that is an embodiment of the invention as it concerns alternate keys.

FIG. 11 illustrates methods of reorganization in the database reorganization system that is an embodiment of the invention as it concerns alternate keys.

FIG. 12 illustrates the elimination of fragmentation in the database reorganization system that is an embodiment of the invention as it concerns alternate keys.

FIG. 13 illustrates the elimination of overall fragmentation in the database reorganization system that is an embodiment of the invention as it concerns alternate keys.

FIG. 14 illustrates the reutilization of blocks in the database reorganization system that is an embodiment of the invention as it concerns alternate keys.

FIG. 15 illustrates operation when reorganization has advanced while an alternate-key search is ongoing and when the retrieval-initiation position of the current reorganization pointer and the retrieval-completion position of the current reorganization pointer are different in the database reorganization system that is an embodiment of the invention as it concerns alternate keys.

FIG. 16 illustrates exclusion of a location table in the database reorganization system that is an embodiment of the invention as it concerns either primary keys or alternate keys.

FIG. 17 is a flowchart illustrating operation in a synchronous tightly-coupled data backup and recovery system that is employed in the invention as it concerns either primary keys or alternate keys.

FIG. 18 is a flowchart illustrating operation in an asynchronous loosely-coupled data backup and recovery system that is employed in the invention as it concerns either primary keys or alternate keys.

FIG. 19 illustrates the transfer of blocks during reorganization.

FIG. 20 is a flowchart of reorganization.

FIG. 21 illustrates the execution of reorganization where the primary system and secondary system are asynchronous.

REFERENCE NUMERALS IN DRAWINGS

-   1 Primary system
-   2 Secondary system
-   10 Blocks
-   11 Alternate-key table
-   12 Primary blocks
-   13, 14 Overflow blocks
-   15, 16 Alternate-key overflow blocks
-   17 Alternate-key blocks
-   LC Current location table
-   LN New location table
-   AAC Current alternate-key table
-   MN New alternate-key table
-   AALC Current alternate-key location table
-   AALN New alternate-key location table
-   UBAT, UABAT Unused-block allocation tables

PREFERRED EMBODIMENTS OF THE INVENTION

Embodiments of the invention as it concerns primary keys and alternate keys are described below with reference to the drawings. That description is preceded by an explanation of the basis of the invention as it concerns primary keys and alternate keys.

Basis of the Invention As It Concerns Primary Keys and the Invention As It Concerns Alternate Keys

The objective of the present invention as it concerns primary keys and the invention as it concerns alternate keys is to build on the concepts of the inventions specified in Japanese Patent publication number Hei11-031096 and U.S. Pat. No. 6,415,375, employing their core components unmodified, and to implement automatic reorganization without interrupting the operation of the data storage and retrieval system.

Likewise, in the data backup and recovery system specified by the present inventors in Japanese Patent publication number 2001-356945, a primary system is either provided with one or more secondary systems, each comprising a set of location tables, blocks and alternate-key tables, or blocks alone are maintained and backed up and that backup is used for recovery. We explain how the invention as it concerns primary keys and the invention as it concerns alternate keys may be applied as well to this data backup and recovery system.

A description follows of the data storage and retrieval system specified in Japanese Patent publication number Hei11-031096. The first characteristic of that system is the use of flat (non-hierarchical) tables called location tables to manage the blocks (primary blocks and overflow blocks) that store records. The primary-key value of a record stored in a block is smaller than the primary-key value of a record stored in the block following it. Records are stored in blocks in the order of their primary keys. This applies within primary blocks and within overflow blocks linked to a primary block, and likewise between a primary block and any overflow blocks linked to it. This gives improved efficiency when retrieving a record in a block.

As a data type, a “primary key” is a unique key and one is required for each record. The primary key in an employee master database may be the employee code, for example, and in a customer master database it may be the customer code. As a data type, an “alternate key” is a non-unique key and multiple kinds of alternate keys may exist within a record. Alternate keys in an employee master database might be employee name, posting or date of employment, for example.

When a record is inserted into a block but cannot be stored in that block, an overflow block is added and the record is stored by using the two contiguously. When an overflow block becomes insufficient, another overflow block is linked to that overflow block, and any number of records may be inserted by thus sequentially linking overflow blocks. These overflow blocks are linked to the first primary block. Location tables are reserved in contiguous areas. Such areas are contiguous in logical order and may lie in physically separate areas. Location tables manage only primary blocks, while overflow blocks are dependent on primary blocks and are not managed by location tables. Thus the creation of overflow blocks does not result in any structural modification of a location table.

The second characteristic of this data storage and retrieval system is accelerated retrieval and storage without traditional indices, and with more efficient index management, achieved by binary searching of location tables for primary keys in order to identify target blocks.
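The binary search of a location table can be sketched as follows. This is a minimal illustration, not the layout of the referenced patent: it assumes each location-table entry holds the lowest primary key of its primary block together with the block address, and the names `location_table` and `find_block` are hypothetical.

```python
from bisect import bisect_right

# Hypothetical layout: one entry per primary block, in primary-key order.
# Each entry is (lowest_primary_key_in_block, block_address).
location_table = [(100, 0x1000), (200, 0x2000), (300, 0x3000)]

def find_block(table, primary_key):
    """Return the address of the primary block whose key range covers primary_key."""
    lowest_keys = [low for low, _ in table]
    # bisect_right gives the index of the first entry whose lowest key is
    # greater than primary_key; the entry just before it is the target block.
    i = bisect_right(lowest_keys, primary_key) - 1
    if i < 0:
        i = 0  # keys below the first entry are assigned to the first block
    return table[i][1]
```

Because the table is flat, the search touches only the table itself; any overflow blocks are reached afterwards by following the links from the primary block found.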

The third characteristic of this data storage and retrieval system is the use of alternate-key tables, also flat tables, for alternate keys. As a data type, an “alternate key” is, as stated above, a non-unique key and multiple kinds of alternate keys may exist within a record. Alternate keys in an employee master database might be employee name, posting or date of employment, for example.

The addition of alternate-key overflow blocks to alternate-key table blocks when the number of key-value entries increases with the addition or modification of alternate keys gives greater efficiency in alternate-key index management without the segmenting applied to traditional indices. When a single alternate-key overflow block is insufficient, another alternate-key overflow block may be linked to it, thus doing away with any limitation on the insertion of alternate-key entries.

The fourth characteristic of this data storage and retrieval system is accelerated alternate-key retrieval by means of performing binary searches on alternate-key tables.

The fifth characteristic of this data storage and retrieval system is that while retrieval efficiency may fall due to an increase in alternate-key overflow blocks when there are many additions and modifications of key values in alternate-key tables, the use of pre-alternate-key tables provides a means of maintaining that efficiency.

In the invention as it concerns primary keys and the invention as it concerns alternate keys, location tables and alternate-key tables are used to retrieve records from data-storage files (aggregates of blocks), as described in Japanese Patent publication number Hei11-031096.

Additionally, the insertion and addition of records in blocks results in the creation of overflow blocks, and the deletion of records results in empty space in blocks. One objective of reorganization is to maximize record access efficiency and storage efficiency, which both suffer from more overflow blocks and more empty space in blocks.

Likewise, record additions, updates and deletions in alternate-key tables result in the modification of alternate-key values, and alternate-key overflow blocks are added and empty space created in alternate-key blocks and alternate-key overflow blocks. One objective of reorganization is to maximize entry access efficiency and storage efficiency, which both suffer from more overflow blocks and more empty space in blocks.

In the invention as it concerns primary keys and the invention as it concerns alternate keys, the combination (in one set) of a location table, a block and an alternate-key table (one of each) defined in this data retrieval and storage system is termed a primary system. In real-world applications, a primary system may comprise multiple such sets, and the reorganization of a single set may be applied to the other sets.

The invention as it concerns primary keys and the invention as it concerns alternate keys are readily comprehended with a database in mind, but their applicability is not restricted to databases and extends also to data storage and retrieval systems and systems in general. Conventional computers load programs and data stored in external storage devices into main memory and then execute them. Thus, whereas external databases and internal main memory have conventionally been discretely separated, high-speed random access memory will likely in future be adopted in external storage devices with the spread of non-volatile memory technologies. If so, there will no longer be any reason to discriminate between external storage and internal memory. The method and system of the present invention may then be applied wherever data is stored, not only in external storage devices.

Objectives of and Reasons for Reorganization

The three objectives of reorganization are as follows: (i) the elimination of overflow blocks, (ii) the elimination of fragmentation and (iii) the reservation of suitable initial storage rates.

Descriptions follow of the objectives of and reasons for reorganization, (i) through (iii) above.

Objective of Reorganization 1: (i) Elimination of Overflow Blocks

Overflow blocks result from the insertion of records. When records have filled a primary block and a further record is then to be inserted into that primary block, the insertion cannot be performed as is. In order to allow such an insertion, an overflow block is allocated to that primary block, the necessary number of records moved from the primary block to the overflow block and the object record then inserted into the original primary block to make the insertion of the record possible.
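The insertion behavior described above can be sketched as follows. This is a simplified illustration under stated assumptions: the capacity of four records, the `Block` class and the `insert` function are hypothetical, and for brevity the sketch inserts the record first and then spills the highest-keyed record into the overflow chain, which leaves the chain in the same key order as moving records first.

```python
from bisect import insort

CAPACITY = 4  # hypothetical number of records per block

class Block:
    def __init__(self):
        self.records = []     # primary keys, kept in ascending order
        self.overflow = None  # next overflow block in the chain, if any

def insert(block, key):
    """Insert key into a block, spilling the largest record into the chain when full."""
    insort(block.records, key)
    if len(block.records) > CAPACITY:
        if block.overflow is None:
            block.overflow = Block()  # allocate and link an overflow block
        # Move the largest record onward so key order holds across the chain.
        insert(block.overflow, block.records.pop())

primary = Block()
for k in [10, 20, 30, 40, 25]:
    insert(primary, k)
```

After the fifth insertion the primary block holds keys 10, 20, 25 and 30, and key 40 has moved to a newly linked overflow block; the location table is not touched.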

However, since record retrieval necessitates a greater load when overflow blocks exist than when primary blocks alone exist, overflow blocks must be made over into primary blocks and managed from location tables in order to achieve faster access.

Also, since records are stored in blocks in the order of their primary keys, the insertion of a record requires the movement of records, and the number of records moved increases when there are many overflow blocks, resulting in efficiency problems.

Objective of Reorganization 2: (ii) Elimination of Fragmentation

Next is the elimination of fragmentation. Fragmentation consists in the dispersion of storage regions. When a record stored in a block (either a primary block or an overflow block) is no longer required and is erased, the space it once occupied in the storage region of the block is then empty. The storage region will remain empty and go wasted unless there is a record to insert. In order to eliminate such wasted storage regions, records stored in a subsequent block are moved to the anterior block so that records are sufficiently stored in that block, which eliminates waste in the use of storage regions.
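The movement of records from a subsequent block into the anterior block can be sketched as follows. The function name `pull_forward`, the list-of-keys block representation and the capacity value are assumptions made for illustration; the sketch fills the anterior block completely, whereas a real system might stop at a prescribed storage rate.

```python
def pull_forward(anterior, following, capacity):
    """Move the lowest-keyed records of `following` into the free space of `anterior`.

    Both blocks are lists of primary keys in ascending order, and every key in
    `following` is greater than every key in `anterior`, so appending keeps order.
    Returns True when `following` has been emptied and can be reused.
    """
    free = max(0, capacity - len(anterior))
    anterior.extend(following[:free])
    del following[:free]
    return len(following) == 0

a, b = [1, 2], [3, 4, 5]
b_is_unused = pull_forward(a, b, capacity=4)
```

Here `a` becomes `[1, 2, 3, 4]` and `b` keeps `[5]`, so `b_is_unused` is `False`; had `b` been emptied, it would become an unused block available for reutilization.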

As well, where an overflow block is generated and few records are stored in that overflow block relative to its storage capacity, that empty space will go unused as wasted storage region unless record insertions subsequently occur.

Considering the matter on the level of blocks, when blocks that had been used are used no longer and blocks are thus mixed together in the storage region and are not reused, the term fragmentation may further be applied to the storage region as a whole. The basis of the invention addresses solutions for the reutilization of unused blocks separately from the elimination of fragmentation within blocks.

Objective of Reorganization 3: (iii) Reservation of Suitable Initial Storage Rates

Next is the reservation of suitable initial storage rates. Initial storage rates are used to prevent to some extent the generation of overflow blocks by leaving a certain proportion of the space in a block empty when first creating the block and writing records to it.

When initially storing records in a block, these records may be stored at 100% of the capacity of the block, but thus storing records at the full capacity of the block will result in the generation of an overflow block immediately when a record is inserted. Not only do overflow blocks lead to lower retrieval efficiency, they are also a cause of the fragmentation described above when overflow blocks store few records and have large amounts of empty space. In order to avoid this development, when first storing records in a primary block, records are stored up to a limit of, for example, 90% of the storage capacity and that empty space is used to store records when records are subsequently inserted, making it possible to prevent the immediate generation of an overflow block.

Thus reserving a defined proportion of the space within a block as empty for record insertion is the approach taken with suitable initial storage rates and is a technique that is well-known.
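The effect of an initial storage rate on the initial loading of blocks can be sketched as follows. The function `load_blocks`, the capacity of ten records per block and the expression of the rate as an integer percentage are assumptions for illustration only.

```python
def load_blocks(sorted_keys, capacity, initial_rate_percent):
    """Distribute keys over new primary blocks, filling each only to the initial rate."""
    # Integer arithmetic avoids floating-point rounding in the per-block limit.
    per_block = max(1, capacity * initial_rate_percent // 100)
    return [sorted_keys[i:i + per_block]
            for i in range(0, len(sorted_keys), per_block)]

blocks = load_blocks(list(range(20)), capacity=10, initial_rate_percent=90)
```

With a capacity of ten and a rate of 90%, each block receives at most nine records, so one slot per block stays free for later insertions before any overflow block is needed.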

Objective of Reorganization 3, Detailed Explanation: Details of the Reservation of Suitable Initial Storage Rates

A suitable initial storage rate may be applied on four occasions: (a) when initially creating a database, (b) when reserving a new block and storing records in it by adding them to the block, (c) when inserting records into an existing reserved block whose space utilization rate is outside the suitable storage rate, and (d) when performing reorganization.

Cases (b), reserving a new block and storing records in it by adding them to the block, and (c), inserting records into an existing reserved block whose space utilization rate is outside the suitable storage rate, both involve the insertion of records into the block. They are discrete instances by a strict definition, but in practice may be implemented according to the same logic.

To explain case (b), reserving a new block and storing records in it by adding them to the block, in somewhat greater detail: a new block is reserved when the block pointed to by the entry immediately preceding the final pointer in the location table (i.e. the final entry) has a storage rate greater than that which is suitable and the record to be stored in that block has a primary-key value greater than the records stored in that block; a new primary block is then allocated and the record stored in that newly allocated primary block.

When records are written to this new primary block by adding them to it, records are stored therein up to the suitable initial storage rate, and when the suitable storage rate is exceeded, the next primary block is allocated. After the allocation of a primary block, records are thus stored in the block until the suitable initial storage rate is reached. This applies likewise to the storage of records in that block by insertion: such records are stored in that primary block until the suitable initial storage rate is reached.

It goes without saying that the final pointer is sequentially advanced when a new primary block has been allocated.

Another instance of the allocation of new blocks is the allocation of overflow blocks. When storing records in an overflow block, the records are stored, as with primary blocks, until the suitable initial storage rate is reached. When the suitable storage rate is reached, the next overflow block is newly allocated and records are then stored in that new overflow block.

When the space utilization rate in an existing reserved block is below the suitable storage rate, the insertion of records into that block is performed as described below. When a block (either a primary block or an overflow block) storing records undergoes a deletion, the record is erased from the block and the space that it had taken up is then empty. This empty space may be reckoned wasted if the space utilization rate of the block is below its suitable initial storage rate. When records are inserted into such a block, it is advantageous in terms of the effective utilization of storage space to insert records until the suitable initial storage rate is reached.

Such an approach of suitable initial storage rates is no more than a means to determine whether or not to allocate a new block; when an overflow block is allocated to a primary block and records inserted in that primary block, they are stored until the storage rate reaches 100%.

The foregoing discussion focuses on location tables, blocks and overflow blocks, but alternate-key blocks and alternate-key overflow blocks entail the same issues and, as discussed below, the reorganization of alternate-key blocks is also an object of the present invention.

Statistical Techniques for the Automation of Reorganization

The utilization of statistical techniques as follows is an effective means of implementing automatic reorganization. These statistical techniques have been used in existing methods. This enables automatic execution by means of software.

Statistical methods are added to conventional data storage systems. In order to track how overflow blocks are generated for blocks, these statistical methods survey the values a-i through a-v below. It is not absolutely necessary to survey all of these values, which may be used selectively according to the goals obtaining.

-   a-i Total number of blocks
-   a-ii Number of primary blocks
-   a-iii Number of overflow blocks
-   a-iv Maximum number of overflow blocks per single primary block
-   a-v Standard deviation of number of overflow blocks per single primary block

The data obtained from b-i and b-ii below are effective in addressing fragmentation and suitable initial storage rates, but the suitable initial storage rate pertains, as discussed above, to the initial condition of a block and fragmentation is of greater importance. In b-i and b-ii, “blocks” includes both primary blocks and overflow blocks.

-   b-i Total block storage capacity
-   b-ii Block storage space actually utilized

Likewise, in order to track how alternate-key overflow blocks are generated for the alternate-key blocks in alternate-key tables, the values c-i through c-iv below are surveyed. It is not absolutely necessary to survey all of these values, which may be used selectively according to the goals obtaining.

-   c-i Number of alternate-key blocks
-   c-ii Number of alternate-key overflow blocks
-   c-iii Maximum number of alternate-key overflow blocks per single alternate-key block
-   c-iv Standard deviation of number of alternate-key overflow blocks per single alternate-key block

Likewise, the data obtained from d-i and d-ii below are effective in addressing fragmentation and suitable initial storage rates, but the suitable initial storage rate pertains, as discussed above, to the initial condition of an alternate-key block and the information on fragmentation is of greater importance.

-   d-i Total storage capacity of both alternate-key blocks and alternate-key overflow blocks
-   d-ii Storage space actually utilized in both alternate-key blocks and alternate-key overflow blocks

Counts should also be made of accesses to location tables and alternate-key tables. This enables the option of not performing reorganization when access is infrequent, even if overflow blocks exist.

The values above are surveyed, and reorganization is performed automatically when predefined thresholds are reached.
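The survey of values a-i through a-v and the threshold check can be sketched as follows. The function names and the threshold values (`max_ratio`, `max_chain`) are hypothetical; the text does not specify which thresholds to use, and the population standard deviation is chosen here only as one reasonable reading of a-v.

```python
from statistics import pstdev

def overflow_stats(overflow_counts):
    """overflow_counts[i] is the number of overflow blocks chained to primary block i."""
    primary = len(overflow_counts)
    overflow = sum(overflow_counts)
    return {
        "total_blocks": primary + overflow,          # a-i
        "primary_blocks": primary,                   # a-ii
        "overflow_blocks": overflow,                 # a-iii
        "max_overflow": max(overflow_counts),        # a-iv
        "stdev_overflow": pstdev(overflow_counts),   # a-v
    }

def needs_reorganization(stats, max_ratio=0.25, max_chain=3):
    """Hypothetical policy: too many overflow blocks overall, or one chain too long."""
    ratio = stats["overflow_blocks"] / stats["primary_blocks"]
    return ratio > max_ratio or stats["max_overflow"] > max_chain

stats = overflow_stats([0, 0, 4, 0])
```

A chain of four overflow blocks on a single primary block trips both hypothetical thresholds here, so `needs_reorganization(stats)` returns `True`.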

As discussed below, the database reorganization system of the invention as it concerns primary keys and the invention as it concerns alternate keys allows reorganization to be performed during operation, without interrupting the primary system. Likewise, where the data backup and recovery system discussed above is employed, reorganization may be performed while maintaining consistency between the primary system and the secondary systems.

Where this automatic reorganization is employed, it should run when theoperation load is low since overall system performance will deteriorateif it is run when the operation load of the system is high, but thisautomatic reorganization system is, as discussed below, capable ofpausing reorganization at any time and later restarting. Thus, it isunnecessary to suspend operation for long periods as with conventionalmethods, and reorganization may proceed gradually at such times as whenthe system load is low.

Selection of Overflow Block Format

On Overflow Block Formats

Three possible overflow block formats are suggested for the data storage and retrieval system discussed above. These are 1) blocks of the same size as primary blocks, 2) blocks of a different size from primary blocks and 3) management from multiple primary blocks.

The present invention assumes the use of overflow blocks that are the same size as primary blocks. This format is preferred because, when overflow blocks are not of the same size as the primary blocks, reorganization may require modification of block size, and because, when overflow blocks that are not of the same size as the primary blocks go unused, major restrictions are imposed on their reutilization. Nor is it preferable to manage an overflow block from multiple primary blocks because this increases the load incurred in reorganization.

Regions Required for Reorganization

Maintenance of Two Location Tables

The execution of automatic reorganization with this automatic reorganization system requires the maintenance, at any given time, of two location tables or two alternate-key tables and therefore the space requisite for them.

The example described below is for three types (A, B and C) of alternate-key tables. (See FIG. 1.)

It is determined whether it is necessary to reorganize the location table LC and any one of the alternate-key tables A, B and C. This consists of examining such values, described above, as the number of overflow blocks and their standard deviation and applying defined threshold values. When performing reorganization of the location table and also one or more of the alternate-key tables, reorganization is not performed simultaneously on all of these; reorganization is performed first on the location table and next on the alternate-key tables individually, one by one. When performing reorganization on each of A, B and C, reorganization is performed first on A; when that reorganization has completed, reorganization is performed on B; and when the reorganization of B has completed, reorganization is performed on C.

That is, space is required for two tables, the old table and the new table, but not simultaneously for the old and new location table and the old and new alternate-key tables all at once; rather, space is required only for the old and new tables of the table undergoing reorganization at any given time and so the amount of space required is not great.

Preferred Embodiment of the Invention As It Concerns Primary Keys

FIG. 1 is a block diagram of an example of a primary system in which is applied the database reorganization system that is an embodiment of the invention as it concerns primary keys.

In FIG. 1 the primary system 1 is comprised of the location table LC, blocks 10, alternate-key table 11A, alternate-key table 11B and alternate-key table 11C. A secondary system of identical structure also exists but is not depicted in the drawing and is omitted here in order to simplify the description.

As shown in FIG. 1, the location table LC shows the positions of the blocks 10 at the block numbers given in the location table LC.

FIG. 2 is an outline of only that part of the primary system depicted in FIG. 1 in which is applied the database reorganization system that is an embodiment of the invention as it concerns primary keys.

In FIG. 2 the blocks 10 are divided into primary blocks 12, overflow blocks 13 and overflow blocks 14 and are presented in the context of the primary system 1 in order to indicate their configuration with the location table LC. In other words, the primary system 1 is comprised of the location table LC, the primary blocks 12, the overflow blocks 13 and the overflow blocks 14.

The first entry in the location table LC references block number 0 in the primary blocks 12.

The second entry in the location table LC references block number 1 in the primary blocks 12, and this primary block 12 references block number 1-2 in the overflow blocks 13, which in turn references block number 1-3 in the overflow blocks 14.

The third entry in the location table LC references block number 2 in the primary blocks 12.

The fourth entry in the location table LC references block number 3 in the primary blocks 12.

The fifth entry in the location table LC references block number 4 in the primary blocks 12.

The sixth entry in the location table LC references block number 5 in the primary blocks 12, and this primary block 12 references block number 5-2 in the overflow blocks 13.

The seventh entry in the location table LC references block number 6 in the primary blocks 12, and this primary block 12 references block number 6-2 in the overflow blocks 13, which in turn references block number 6-3 in the overflow blocks 14.

The eighth entry in the location table LC references block number 7 in the primary blocks 12.

The ninth entry in the location table LC references block number 8 in the primary blocks 12.

Below, the location table LC is reckoned to reference the individual primary blocks 12.

In the primary system 1 of FIG. 2, overflow blocks are generated for block number 1, block number 2, block number 5 and block number 6 among the primary blocks 12.

Reorganization of the location table LC and the primary blocks 12, the overflow blocks 13 and the overflow blocks 14 in such a primary system 1 would be performed as follows.

The following description of the operation of the database reorganization system in an embodiment of the invention as it concerns primary keys is based on FIG. 2 and makes reference also to FIG. 3.

FIG. 3 here illustrates the operation of the database reorganization system that is an embodiment of the invention as it concerns primary keys.

Location Table and Block Reorganization: Elimination of Overflow

First we describe the reorganization of the location table LC.

“Reorganization of the location table LC” in FIG. 2 and FIG. 3 is the procedure described below.

The blocks 10 managed by the location table LC are only the primary blocks 12. The overflow blocks 13 are managed by the primary blocks 12. In other words, the primary blocks 12 maintain the addresses of the overflow blocks 13. Therefore, when using the location table LC to retrieve data with a primary key, if a binary search of the location table LC returns block 15 among the primary blocks 12, it is then necessary to find the object record within that block 15. When multiple overflow blocks 13 and 14 are linked to the primary blocks 12, it takes longer, to the extent of the number of overflow blocks 13 (14, etc.), to find a record than it does when only the primary blocks 12 (in FIG. 1, for example, block number 0, block number 3, block number 4, block number 7 and block number 8) exist.
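The retrieval cost described above can be illustrated with a minimal sketch. It assumes that each location table entry holds the high value of the primary keys reachable through it and that records are kept in a dictionary per block; the names `Block`, `Entry` and `find_record` are hypothetical and chosen only for this example.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import Optional


@dataclass
class Block:
    records: dict                       # primary key -> record
    overflow: Optional["Block"] = None  # address of the next overflow block


@dataclass
class Entry:
    high_key: str   # assumed: highest primary key reachable via this entry
    block: Block    # the primary block managed by this entry


def find_record(location_table, key):
    """Binary-search the location table, then walk the overflow chain.
    Returns (record, number_of_blocks_read)."""
    highs = [e.high_key for e in location_table]
    i = bisect_left(highs, key)
    if i == len(location_table):
        return None, 0
    block, reads = location_table[i].block, 0
    while block is not None:
        reads += 1
        if key in block.records:
            return block.records[key], reads
        block = block.overflow
    return None, reads
```

With a primary block chained to two overflow blocks, a key held in the last overflow block costs three block reads, whereas the same key costs one read once each block has its own location table entry.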

In order to avoid this, the embodiment of the invention as it concerns primary keys eliminates overflow blocks and makes all overflow blocks 13 and 14 over into primary blocks (12), thus permitting records to be found in less time by managing them with the location table (LC or LN).

Furthermore, since records are held in the blocks (10) in the order of their primary keys, the insertion of a record requires that records be moved, and a large number of overflow blocks (13, 14) results in a large number of records to be moved, which entails issues of efficiency.

Using Two Location Tables, One Current and One New, to Perform Reorganization

As shown in FIG. 3, the invention as it concerns primary keys uses two location tables, the current location table LC and the new location table LN, to perform reorganization.

In the invention as it concerns primary keys a count is made of the number of overflow blocks (13, 14 . . . ) generated and so the sum of the number of entries in the location table LC and the number of overflow blocks (13, 14 . . . ) is the number of entries in the new location table LN. In this specification, the location table in use at the time of reorganization is referred to below as “LC” and the new location table is referred to below as “LN”. Since the number of entries may increase due to insertions during reorganization and will also increase after reorganization with the addition of data, a number greater than that actually required should be reserved.

However, as described below, since the number of blocks required may also decrease with the elimination of fragmentation and the number of blocks required varies with the reservation of a suitable initial storage rate, the optimal method is to calculate the figure on the basis of the number of records stored and the suitable initial storage rate.
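The two ways of sizing the new location table LN can be compared with a short sketch. It assumes fixed-size records so that a block holds a known number of them; the function names and the `spare` parameter (the extra entries reserved for later growth) are hypothetical.

```python
def ln_entries_from_overflow(lc_entries, overflow_blocks, spare=0):
    """Entries in LN = entries in LC + overflow blocks, plus reserved spare."""
    return lc_entries + overflow_blocks + spare


def ln_entries_from_records(record_count, records_per_block,
                            initial_rate_pct, spare=0):
    """Entries in LN computed from the stored records and the suitable
    initial storage rate (assumes fixed-size records)."""
    per_block = max(records_per_block * initial_rate_pct // 100, 1)
    blocks = -(-record_count // per_block)  # integer ceiling division
    return blocks + spare
```

For nine entries and six overflow blocks the first method gives 15 entries, while 900 records at 100 records per block and a 90% initial storage rate need only 10, which is why the record-based calculation is described as the better basis.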

In the embodiment of the invention as it concerns primary keys, a contiguous region is reserved by a first means on the primary system 1 that is sufficient to hold the capacity of the new location table LN.

Once the region for the new location table LN is reserved, the entries in the current location table LC are, as shown in FIG. 3, sequentially written over to the new location table LN, which procedure is described below.

The Reorganization Pointers

In the embodiment of the invention as it concerns primary keys, the following operation is performed by a second means. First, reorganization pointers are created. These reorganization pointers indicate through which entry in the location tables (LC and LN) reorganization has completed, and so two such pointers are provided, one for the current location table LC and one for the new location table LN. The reorganization pointer for the current location table LC is termed RPLC, and the reorganization pointer for the new location table LN is termed RPLN.

Here, the initial value of the current location table LC reorganization pointer RPLC is the first address in the current location table LC, and the initial value of the new location table LN reorganization pointer RPLN is the first address in the new location table LN.

The first, second, third and so on entries in the location table LC of the primary system 1 reference, as described above, specific block numbers in the primary blocks 12.

In FIG. 3, the first entry of the reorganization pointer RPLC, the primary block 12 (block number 0) managed by that entry and its overflow blocks are first of all placed under exclusion. Overflow blocks do not exist in this case and so only the primary block 12 (block number 0) is affected.

Next, the first entry (block number 0) is written over (S1 in FIG. 3) from the current location table LC to the new location table LN. When doing so, a check is made whether any overflow blocks are linked to the block 12 (block number 0) managed by the first entry (block number 0). If not, the addresses of the (current) reorganization pointer RPLC and the (new) reorganization pointer RPLN are changed to point to the beginning of the second entry.

Since no overflow blocks are linked to the block number 0 primary block 12 in FIG. 3, both the (current) reorganization pointer RPLC and the (new) reorganization pointer RPLN are changed to point to the beginning of the second entry. Exclusion is lifted on the first entry of the (current) reorganization pointer RPLC, the primary block 12 (block number 0) managed by that entry and its overflow blocks. Overflow blocks do not exist in this case and so only the primary block is affected.

Next, the second entry (block number 1) is processed. The second entry of the location table LC, the primary block 12 (block number 1) managed by that entry and the overflow block 13 (block number 1-2) and the overflow block 14 (block number 1-3) are placed under exclusion. The two overflow blocks 13 and 14 are linked to the primary block 12 managed by the second entry in the current location table LC. Cases in which overflow blocks 13 and 14 are thus linked to a primary block 12 are handled as follows. The second entry in the current location table LC is written over (S2 in FIG. 3) to the second entry in the new location table LN. Rather than performing a simple write operation, the low value and the high value of the primary key of the record stored in the block are modified. Where the location table LC and LN entries hold the low value and high value of the primary keys of the blocks (12, 13 and 14), a simple write operation would leave the low value and high value of the primary keys of the entries in the new location table LN failing to match the low value and high value of the primary keys of the records stored in the blocks (12, 13 and 14), and so this is avoided.

Assume a low value of 0000 and a high value of 0299 for the primary key value of the second entry in the current location table LC. Given then a low value of 0000 and a high value of 0099 for the primary key value of the record stored in the primary block 12, a low value of 0100 and a high value of 0199 for the primary key value of the record stored in the first overflow block 13, and a low value of 0200 and a high value of 0299 for the primary key value of the record stored in the second overflow block 14, the operation proceeds as follows.

The low value of the primary key value of the second entry in the new location table LN is 0000 and its high value 0099. Since the address of the primary block 12 (block number 1) is given to the address of the block 10, it takes (S3 in FIG. 3) the same value as the address of the current location table LC. Next, the address of the first overflow block 13 (block number 1-2) and the low value 0100 and the high value 0199 of its primary key are assigned (S3 in FIG. 3) to the third entry in the new location table LN.

The address of the second overflow block 14 (block number 1-3) is assigned to the fourth entry in the new location table LN, and its primary key assigned the low value of 0200 and the high value of 0299. (S4 in FIG. 3)

Next, the overflow block address in the primary block 12 (block number 1) is reset and the overflow block 13 (block number 1-2) delinked from the primary block 12 (block number 1). (S5 in FIG. 3) Next, the overflow block address in the first overflow block 13 is reset and the second overflow block 14 (block number 1-3) delinked from the first overflow block 13 (block number 1-2). (S6 in FIG. 3)

Next, the (current) reorganization pointer RPLC is updated to point to the third entry in the current location table LC, and the (new) reorganization pointer RPLN updated to point to the fifth entry in the new location table LN.

Next, exclusion is lifted on the second entry of the (current) reorganization pointer RPLC, the primary block 12 (indicated as block number 1 in FIG. 3) managed by that entry and the overflow block 13 (indicated as block number 1-2 in FIG. 3) and the overflow block 14 (indicated as block number 1-3 in FIG. 2).

Next, the third entry (block number 2) is processed. The third entry in the current location table LC, the primary block 12 (block number 2) managed by that entry and the overflow block 13 (block number 2-2) are placed under exclusion.

The fifth entry in the new location table LN is assigned the address of the primary block 12 (block number 2) and so takes (S7 in FIG. 3) the same value as the address of the current location table LC. Next, the address of the first overflow block 13 (block number 2-2) and the low value and the high value of its primary key are assigned (S8 in FIG. 3) to the sixth entry in the new location table LN.

Next, the overflow block address in the primary block 12 (block number 2) is reset and the overflow block 13 (block number 2-2) delinked from the primary block 12 (block number 2). (S9 in FIG. 3)

Next, the (current) reorganization pointer RPLC is updated to point to the fourth entry in the current location table LC, and the (new) reorganization pointer RPLN updated to point to the seventh entry in the (new) location table LN.

Next, exclusion is lifted on the third entry of the (current) reorganization pointer RPLC, the primary block 12 (indicated as block number 2 in FIG. 3) managed by that entry and the overflow block 13 (indicated as block number 2-2 in FIG. 3). Next, the fourth entry of the reorganization pointer RPLC, the primary block 12 (block number 3) managed by that entry and its overflow blocks are placed under exclusion. Overflow blocks do not exist in this case and so only the primary block 12 (block number 3) is affected.

Next, the fourth entry (block number 3) is written over (S10 in FIG. 3) from the current location table LC to the new location table LN. When doing so, a check is made whether any overflow blocks are linked to the block 10 (block number 3) managed by the fourth entry (block number 3). Since none are, the (current) reorganization pointer RPLC is changed to point to the beginning of the fifth entry, and the address of the (new) reorganization pointer RPLN is changed to point to the beginning of the eighth entry.

Exclusion is lifted on the fourth entry of the (current) reorganization pointer, the primary block 10 (block number 3) managed by that entry and its overflow blocks. Overflow blocks do not exist in this case and so only the primary block is affected. This procedure permits reorganization without rewriting the overflow blocks to another location.

Reorganization then proceeds sequentially in the same fashion from the fifth entry (block number 4) onwards.

FIG. 3 illustrates the state in which reorganization has completed through the fourth entry of the current location table LC.

The value of the (new) reorganization pointer RPLN in FIG. 3 points to the beginning of the eighth entry (block number 7) in the location table LN.
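The pass over the first four entries can be traced with a minimal sketch. It models LC and LN as Python lists, uses zero-based indices for RPLC and RPLN, and lets a `Block` object stand in for a location table entry (block address plus low and high key values); these modelling choices, the names, and the omission of real exclusion control are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Block:
    name: str
    low: str = ""
    high: str = ""
    overflow: Optional["Block"] = None


def reorganize_entry(lc, ln, rplc, rpln):
    """Process the LC entry at rplc and return the updated (rplc, rpln).
    Exclusion on the entry and its chain would be taken here and
    released before returning; it is omitted in this sketch."""
    block = lc[rplc]
    while block is not None:
        nxt = block.overflow
        ln[rpln] = block       # LN entry: this block with its own low/high keys
        block.overflow = None  # delink the overflow block from its predecessor
        rpln += 1
        block = nxt
    return rplc + 1, rpln


def build_example():
    """Blocks 0, 1 (with 1-2 and 1-3), 2 (with 2-2) and 3, as in the text."""
    b13 = Block("1-3", "0200", "0299")
    b12 = Block("1-2", "0100", "0199", b13)
    b1 = Block("1", "0000", "0099", b12)
    b2 = Block("2", overflow=Block("2-2"))
    return [Block("0"), b1, b2, Block("3")]
```

Running `reorganize_entry` four times from `rplc = rpln = 0` leaves RPLC at index 4 (the fifth entry) and RPLN at index 7 (the eighth entry), and no block is copied to another location: only entries are written and links reset.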

Problems Arising from the Format of Entries in Alternate-Key Tables and Their Resolution

When performing the reorganization procedure described above, it is necessary to rewrite alternate-key table entries. This varies with the format of entries in alternate-key tables.

Differences in Methods Varying with the Format of Entries in Alternate-Key Tables

In the data storage and retrieval system specified above, entries in alternate-key tables (cf. 11A, 11B and 11C in FIG. 1) are made up of an alternate key, the physical address of the block in which the record of that key value is stored and the primary key of the record of that key value, and the drawing illustrates a method made up of an alternate key, the block number of the block in which the record of that key value is stored and the primary key of the record of that key value.

First, consider an alternate-key table whose entries are made up of an alternate key, the block number of the block in which the record of that key value is stored and the primary key of the record of that key value.

After the operation described above, the alternate-key table entries of the records that had been stored in the former first and former second overflow blocks are modified. If the numbers of blocks in which records are stored are maintained in the entries of the alternate-key table, the block numbers of the primary block and the overflow blocks were all 1 before reorganization, but with reorganization the block number of the former primary block becomes 1, the block number of the former first overflow block becomes 2 and the block number of the former second overflow block becomes 3. These changes are reflected in the requisite entries of the alternate-key table.
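A minimal sketch of this renumbering follows, assuming alternate-key table entries are held as dictionaries with the three fields named in the text. The field names, the function name and the mapping from primary key to new block number are hypothetical.

```python
def renumber_alternate_entries(alt_table, new_block_of):
    """Rewrite the block number in each alternate-key entry whose record
    now lives under a different block number.

    alt_table    -- list of dicts with 'alt_key', 'block_no', 'primary_key'
    new_block_of -- dict: primary key -> block number after reorganization
    Returns the number of entries rewritten.
    """
    changed = 0
    for entry in alt_table:
        new_no = new_block_of.get(entry["primary_key"])
        if new_no is not None and new_no != entry["block_no"]:
            entry["block_no"] = new_no
            changed += 1
    return changed
```

In the example of the text, records from the former primary block keep block number 1, while records from the former first and second overflow blocks are rewritten to 2 and 3, so only the entries for records in former overflow blocks are touched.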

Since this operation is performed on the alternate-key table entries of records that undergo reorganization at a given point, it must be noted that it takes more time than does the delinking of overflow blocks.

Addition of New Alternate-Key Table Entry Formats

It is advantageous in reorganization to adopt a new format. In brief, block numbers are maintained in the entries of alternate-key tables. Assigning an alternate-key value and the primary key value of the record of that key value to the entries that maintain block numbers in an alternate-key table eliminates the need to rewrite alternate-key table entries in reorganization and permits alleviation of the load entailed by reorganization. If this method is adopted, however, retrieval using alternate keys will be time-consuming since target blocks will not be identified unless a search is run on the location table with a primary key value after that primary key is first obtained from a search of the alternate-key table.

Next, consider an alternate-key table whose entries are made up of an alternate key, the physical address of the block in which the record of that key value is stored and the primary key of the record of that key value.

This format has the advantages of retrieving a block directly when data is retrieved with an alternate key and, since block addresses do not change with reorganization, a lower load during reorganization than with the use of block numbers.

While possessed of these advantages, this format also suffers from the shortcomings described below and so due consideration must be given to its utilization. First, if the blocks in which records are stored change with the elimination of fragmentation, it will be necessary to rewrite the block addresses of the alternate-key table entries relating to those records, which procedure will constitute a considerable load. Changes in the blocks in which records are stored will also result from the addition of overflow blocks when records are inserted and records are newly moved to overflow blocks.

Next, as described below, there is a greater possibility of deadlock due to a different order of exclusion.

Additionally, as described below, limitations are incurred when performing recovery with the data backup and recovery system, for which domestic priority has been claimed.

The frequency of location table reorganization and ease of recovery should be taken into consideration when choosing among these formats.

Location Table and Block Reorganization: Exceptions to Elimination of Overflow

While the foregoing description deals with the elimination of overflow blocks, that following deals with cases in which elimination cannot be performed. These are cases in which spanned records are present. A spanned record is one that is larger than the size of a block and so is segmented to a size that can be stored in blocks and stored across multiple blocks, and is a format that has long been in use.

Since a spanned record is a single record that has been segmented, each segment has the same primary key and it is not stored in multiple primary blocks, but always in a single primary block and one or more overflow blocks or in multiple overflow blocks. Storage in multiple overflow blocks occurs when records are already stored in the primary block and the overflow block immediately posterior and storage of the spanned record begins in the midst of an overflow block.

Where a spanned record is stored, maintaining that information in the blocks facilitates understanding. Information specifying whether it is the beginning, the middle or the end of the spanned record is maintained. Otherwise, it does not differ from the case of storage of regular records.

In such a case, overflow blocks involving regular records may be eliminated, but since blocks in which spanned records are stored are necessarily of a structure that entails overflow blocks, these overflow blocks may not be eliminated. In such cases, information relating to spanned records should be output as post-reorganization information.

Location Table and Block Reorganization: Elimination of Fragmentation

A method of eliminating overflow blocks has been described above, but fragmentation also presents significant problems in terms of efficiency. The description following makes reference to FIG. 4 to implement operations similar to those used in reorganization for the elimination of overflow blocks in order to eliminate fragmentation.

FIG. 4 illustrates the operation of reorganization that eliminates fragmentation in the database reorganization system that is an embodiment of the invention as it concerns primary keys.

FIG. 4 assumes that records are stored moderately in block numbers 0, 1, 1-2, 1-3, 2 and 2-2 in primary blocks 12, 13 and 14, and that the elimination of overflow blocks has been completed with the method described above. The block numbers used in the description following are the block numbers in current location table LC. Where numbers from new location table LN are used, it is so stated.

Records are stored in block number 3 of the primary blocks 12 up to 30% of its storage capacity. Records are stored in block number 4 of the primary blocks 12 up to 40% of its storage capacity. Records are stored in block number 5 of the primary blocks 12 up to 70% of its storage capacity, in block 5-2 of the overflow blocks 13 up to 60% of its storage capacity and in block 6 of the primary blocks 12 up to 70% of its storage capacity. The suitable initial storage rate of each block after reorganization is 90%. This is to prevent the generation of overflow blocks immediately upon the insertion of records after reorganization.

The reorganization system is about to reorganize block 3, which is referenced by the fourth entry, to which the reorganization pointer RPLC of the current location table LC is pointing. However, its storage rate (the volume of records stored in that block as a proportion of the capacity of the block) is 30%, which does not satisfy the suitable initial storage rate. Therefore, attention turns to the current block number 4 of the primary blocks 12. Since the storage rate of this block number 4 of the primary blocks 12 is 40%, adding the two blocks together still falls short of the suitable initial storage rate (90%). Attention then turns to block number 5 of the primary blocks 12, which has a storage rate of 70%; adding it would thus exceed the suitable initial storage rate of 90%.

Leaving the records stored in block number 3 of the primary blocks 12 untouched, the records in block number 4 of the primary blocks 12 are moved to block number 3 of the primary blocks 12. Furthermore, in order to achieve the suitable initial storage rate, the first 20% of the records stored in block number 5 of the primary blocks 12 is moved to block 3 of the primary blocks 12, and the remaining 50% of the records in block number 5 of the primary blocks 12 is shifted to the beginning of that block (to the left in FIG. 4). When doing so, the alternate-key table entries of the records shifted are revised in the same fashion as described above in the elimination of overflow blocks.
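The arithmetic of this step can be checked with a small sketch. It assumes that all blocks have the same capacity, so that storage rates expressed as percentages can simply be added; the function name `fill_plan` is hypothetical.

```python
def fill_plan(target_pct, first_pct, following_pcts):
    """Return how much (in percent of block capacity) to move from each
    following block into the first block so that it reaches target_pct."""
    room = target_pct - first_pct
    moves = []
    for pct in following_pcts:
        if room <= 0:
            break
        take = min(pct, room)  # take the whole block or only what still fits
        moves.append(take)
        room -= take
    return moves
```

For block 3 at 30%, followed by block 4 at 40% and block 5 at 70%, `fill_plan(90, 30, [40, 70])` yields `[40, 20]`: all of block 4 and the first 20% of block 5 are moved, leaving 50% in block 5.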

The low value and high value of the primary keys of records stored in the blocks are modified. Where current location table LC and new location table LN entries hold low and high values of primary keys in blocks, the low value and high value of primary keys in new location table LN entries are modified.

Since block 3 of the primary blocks 12 is done, the address of the seventh entry in the new location table LN is rewritten (S20 in FIG. 4) to block 3.

Reorganization pointer RPLN moves to the beginning of the eighth entry in the location table LN. At this point block 4 of the primary blocks 12 becomes an unused block. (S21 in FIG. 4)

Next, the operation is performed on block 5 of the primary blocks 12, and the storage rate of block 5 of the primary blocks 12 is now 50% due to reorganization.

Since block 5-2 of the overflow blocks 13 has a storage rate of 60%, the 30% of records from the beginning of block 5-2 of the overflow blocks 13 are moved to block 5 of the primary blocks 12, and the remaining records in block 5-2 of the overflow blocks 13 are shifted to the beginning of that block (moved leftwards in the drawing). The link between block 5 of the primary blocks 12 and block 5-2 of the overflow blocks 13 is then cut (S22 in FIG. 4). This sets the overflow block address of block 5 of the primary blocks 12 to a specific value (for example, zero). The alternate-key table entries of the records moved are revised in the same fashion as described above in the elimination of overflow blocks.

Since block 5 of the primary blocks 12 is done, the address of the eighth entry in the location table LN is rewritten (S23 in FIG. 4) to block 5 of the primary blocks 12. Next, since block 5-2 that had been an overflow block 13 has a storage rate of 30% and block 6 that is the next primary block 12 has a storage rate of 60%, all the records of block 6 that is the next primary block 12 are moved to block 5-2 that had been an overflow block 13. Since the storage rate of block 5-2 that had been an overflow block 13 is now a suitable one, the address of the eighth entry in the new location table LN is rewritten (S24 in FIG. 4) to block 5-2 that had been an overflow block 13. The reorganization pointer RPLN moves to the beginning of the ninth entry in the location table LN. At this point block 6 of the primary blocks 12 becomes an unused block. (S25 in FIG. 4)

Reservation of Suitable Initial Storage Rates

The reservation of a suitable initial storage rate may require the addition of blocks, the converse of handling fragmentation. What this means is that, for example, given a space utilization rate of 100% in all blocks, achieving a suitable initial storage rate of 90% entails the addition of one block to the nine that may exist at the point of reorganization and storing records in each of them for a block storage rate of 90%. It may not be possible in some cases, for reasons of block and record size, to strictly implement reservation of suitable storage rates, and since it may be necessary to subject considerably large numbers of blocks to reorganization at particular points in time, the requisite exclusion range may expand accordingly and have an adverse effect on system operation.
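The nine-to-ten block example can be reproduced with a short calculation. The sketch assumes equal-capacity blocks and integer percentages; the function name is hypothetical.

```python
def blocks_for_rate(storage_pcts, target_pct):
    """Return (blocks needed at target_pct, change from the current count).
    A positive change means blocks must be added; a negative one means
    blocks are freed, as when fragmentation is eliminated."""
    total = sum(storage_pcts)
    needed = -(-total // target_pct)  # integer ceiling division
    return needed, needed - len(storage_pcts)
```

Nine blocks at 100% give `(10, 1)`, that is, one block must be added, whereas three blocks at 30%, 40% and 70% give `(2, -1)`, freeing one block.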

Multiple Initial Storage Rates

In order to prevent such circumstances arising, it is preferable in operational terms to use multiple values, such as 85% to 90%, for the suitable initial storage rate. In this case, the suitable initial storage rate need only fall within the range of 85% to 90%.

Suitable initial storage rates may also be specified block by block, depending on the overflow-block status of each block. Such block-specific suitable initial storage rates may be implemented by adding an element in location table entries and specifying them there or by including an element within blocks and specifying them there. When reserving suitable initial storage rates, records may be rewritten from their original blocks to other blocks. It is also necessary to rewrite any alternate-key table entries related to the records thus rewritten.

Reorganization in Practice and Prevention of Deadlock

As the execution of reorganization in practice consists of at once performing the elimination of overflow blocks, the elimination of fragmentation and the reservation of suitable initial storage rates, it is a combination of these three.

In principle, a reorganization system may be implemented as described above, but as blocks are sequentially read and blocks created with suitable initial storage rates, the risk of deadlock increases since the exclusion range cannot be determined from the outset and exclusion extends sequentially. Effective ways of preventing deadlock are as follows.

The first method of preventing deadlock is to read the blocks in sequence without placing them under exclusion, find their record storage rates and calculate the appropriate size for combinations of multiple blocks in order to determine the exclusion range.
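One way to picture the first method is a scan that reads storage rates without exclusion and returns the run of blocks to be placed under exclusion in a single step. This is a sketch under assumptions: blocks are given in key order as a list of integer storage percentages (overflow blocks following their primary block), and the stopping rule (stop once the accumulated volume reaches the target rate) is an illustrative choice, not one stated in this description.

```python
def plan_exclusion_range(storage_pcts, start, target_pct):
    """Scan from 'start' without locking and return the exclusive end
    index of the run of blocks that one reorganization pass should lock."""
    total = 0
    end = start
    while end < len(storage_pcts):
        total += storage_pcts[end]
        end += 1
        if total >= target_pct:
            break
    return end
```

Because the range is known before any exclusion is taken, all of its entries and blocks can be locked in a fixed order at once rather than extended block by block. Storage rates may change between the scan and the locking, so a real implementation would presumably re-check them after acquiring exclusion.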

The second method of preventing deadlock is to provide, in the data storage and retrieval system described above, location table entries capable of maintaining, in addition to block numbers and block addresses, either or both of the low value and high value of the primary key values of records stored in blocks.

Adding Elements to Location Table Entries

In addition to this information, the storage rates of records or thenumber of bytes occupied by records in blocks and the number of overflowblocks linked to blocks are added to location table entries. Thiseliminates the need to read blocks, as described for the first methodabove, and allows that information to be gained simply by reading thelocation table. However, since the storage rates of records and the thenumber of bytes occupied by records in blocks changes with the insertionand deletion of records and the need arises to rewrite the locationtable, a choice should be made between the first and the second methodof preventing deadlock described above depending on the state of recordgeneration.

This information may also be maintained within blocks.

Thus, the reorganization range is determined upon finding the storage rates within blocks, and the location table entries within that range, the blocks that those entries point to and the overflow blocks linked to those blocks are placed under exclusion. Next, the number of blocks and overflow blocks subject to reorganization and their storage capacity are found, the actual volume of the records requiring storage is found, and it is assessed whether the number of blocks required is equal to, greater than or less than the sum of the current number of blocks and overflow blocks. Then, applying the logic described above for the elimination of overflow blocks, the elimination of fragmentation and the reservation of suitable initial storage rates, the blocks are reorganized and entries are created in the new location table. In this case, the reorganization pointers move at once by the number of blocks subject to reorganization.
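The assessment of the number of blocks required in a pass may be illustrated by the following minimal sketch. It is an approximation under stated assumptions, not the implementation of the invention: record volume is treated as freely divisible across blocks (real records are stored whole), and the function name plan_reorganization_pass and its parameters are hypothetical.

```python
import math


def plan_reorganization_pass(used_bytes, block_capacity, initial_storage_rate):
    """Estimate how many blocks one reorganization pass needs.

    used_bytes: bytes occupied by records in each primary or overflow block
    in the range placed under exclusion (hypothetical input format).
    """
    total_volume = sum(used_bytes)
    # Bytes each reorganized block may hold at the suitable initial storage rate.
    target_fill = int(block_capacity * initial_storage_rate)
    required = max(1, math.ceil(total_volume / target_fill))
    current = len(used_bytes)  # current number of blocks plus overflow blocks
    if required < current:
        verdict = "fewer"   # surplus blocks become unused blocks
    elif required > current:
        verdict = "more"    # new blocks must be acquired
    else:
        verdict = "equal"
    return required, verdict
```

For example, four blocks of 1000 bytes holding 900, 300, 200 and 100 bytes at a suitable initial storage rate of 0.75 give a requirement of two blocks, so two blocks are left over.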

Reorganization is thus performed on anywhere from one block to several tens of blocks in a single pass, but this does not interfere with regular data processing, since each pass appears to the system as a regular data processing transaction.

By performing successive passes of this reorganization on the location table and the blocks, reorganization of the whole is completed.

Completion of Reorganization

A description follows of methods of recognizing the completion of reorganization. A final pointer is provided to the current location table LC to indicate the final position used by that location table.

The final pointer is provided for the following purpose. The location table is reserved in a contiguous region. The data storage and retrieval system describes a method for the case in which location table entries are insufficient and an additional location table cannot be added in a region contiguous with the current location table: an additional location table is provided in a region discontiguous with the first location table, and addresses are converted so that binary searches may be performed as though the areas were contiguous. This method is not recommended, however, because the load of address conversion increases with large numbers of discontiguous areas. Therefore, a fully adequate area for location tables is reserved from the outset, and the final pointer points to the address of the next entry after the entries used in order to distinguish used entries from unused entries. Thus, the provision of the final pointer allows the retrieval of records by means of primary keys even if unused entries exist in the current location table LC, by executing a binary search between the first address and the final pointer in the current location table LC.
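A binary search bounded by the final pointer may be sketched as follows. This is an illustration under assumptions: the location table is modelled as a Python list holding only the low primary key value of each entry, unused entries are None, the final pointer is a list index, and the function name find_entry is hypothetical.

```python
from bisect import bisect_right


def find_entry(low_keys, final_pointer, target):
    """Return the index of the location table entry whose block may hold target.

    Only the entries in [0, final_pointer) are examined, so unused entries
    at or after the final pointer are never compared.
    """
    i = bisect_right(low_keys, target, 0, final_pointer) - 1
    return i if i >= 0 else None  # None: target is below every used entry
```

With low_keys = [10, 20, 30, None, None] and a final pointer of 3, a search for key 25 returns entry 1, and the two unused entries are never touched.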

Methods of Detecting the Completion of Reorganization

The final pointer is used as an indicator. Reorganization of the location table and blocks is complete when, as reorganization runs, the address pointed to by the current reorganization pointer RPLC matches the address pointed to by the final pointer. Overflow blocks linked to the block that the final entry points to do not represent a problem because the current reorganization pointer RPLC does not move until the reorganization of these overflow blocks is complete.

When reorganization has completed, the current reorganization pointer RPLC is no longer needed, and so the new location table LN is designated the current location table and the location table LC may be deleted.

FIG. 20 is a flowchart of the reorganization described above. S1 here is an instruction for reorganization. S2 represents a means of creating the new location table. S3 evaluates the completion of reorganization. S4 represents a means of examining the status of storage in blocks and allocating one or multiple blocks for reorganization. S5 evaluates whether overflow blocks are linked to the primary block. S6 represents a means for delinking overflow blocks and creating new entries in the new location table. S7 evaluates whether fragmentation is present. S8 represents a means for moving records between blocks, rewriting blocks and eliminating fragmentation. S9 evaluates whether suitable initial storage rates are exceeded. S10 represents a means for creating new blocks, moving records between blocks and rewriting blocks. S11 evaluates whether unused blocks exist. S12 represents a means of making registrations in an unused block allocation table. S13 represents a means of transcribing entries from the current location table to the new location table.

Reutilization of Unused Blocks

A method of reorganization has thus been described that addresses fragmentation, but in FIG. 4 block 4 of the primary blocks 13 of the current location table LC and block 6 of the primary blocks 13 are left unused. Left as it is, this may result in failing to eliminate overall fragmentation while eliminating fragmentation within blocks. The following expedient is adopted in order to prevent this outcome.

Unused Block Allocation Table, Start-Position Pointer and End-Position Pointer

FIG. 5 and FIG. 6 assist in the description of a method for eliminating overall fragmentation in the database reorganization system that is a preferred embodiment of the invention as it concerns primary keys.

A method of eliminating overall fragmentation in the database reorganization system is, as shown in FIG. 5, to use an unused block allocation table UBAT. The unused block allocation table UBAT is a table of the format shown in FIG. 5, and its purpose is to store the addresses of unused blocks among the blocks 10. In this method of eliminating overall fragmentation in the database reorganization system, two pointers are also used: a start-position pointer NABPS to indicate the start position in the unused block allocation table UBAT and an end-position pointer NABPE to indicate the end position in the unused block allocation table UBAT. FIG. 5 represents a state in which seven unused blocks have appeared where none at all previously existed.

In their initial states, both the start-position pointer NABPS and the end-position pointer NABPE point to the beginning of the unused block allocation table UBAT. Here, when an unused block appears, the unused block of the blocks 10 is registered in the unused block allocation table UBAT entry that the end-position pointer NABPE is pointing to, and the end-position pointer NABPE is rewritten to point to the next entry in the unused block allocation table. The result of the sequential execution of this operation is shown in FIG. 5, which illustrates the state after the appearance of seven unused blocks. Here, the end-position pointer NABPE is pointing, as shown in FIG. 5, to the eighth entry in the unused block allocation table UBAT.

Next, a description follows of a method for the reutilization of unused blocks. In the database reorganization system that is a preferred embodiment of the invention as it concerns primary keys, when the need arises to acquire a new block (for example, the addition and acquisition of the next primary block after the final pointer in the location table with the addition of a record, or the addition of an overflow block), rather than acquiring the block from a new area, the unused block allocation table UBAT is referenced and, if an unused block exists, blocks registered in the unused block allocation table UBAT are prioritized for use. A method of utilizing unused blocks is to use the block in the entry that the start-position pointer NABPS is pointing to. The unused block allocation table UBAT contains the addresses of unused blocks, so when a block is added, the address is written to the location table (location tables LC and LN in FIG. 2 or FIG. 4), and when an overflow block is added, the address is written to the primary block that manages that block or to the pointer to the overflow block. The content of the start-position pointer NABPS is then rewritten to point to the next entry in the unused block allocation table. FIG. 5 illustrates a state immediately following two executions of such rewriting and the utilization of two unused blocks.

The unused block allocation table UBAT may be used in cyclical fashion. When an unused block appears, the position of the end-position pointer NABPE moves towards the end of the unused block allocation table UBAT (downwards in the drawing), and when an unused block is utilized, the position of the start-position pointer NABPS likewise slides towards the end of the unused block allocation table UBAT (downwards in the drawing), and so the one table may be used cyclically as long as the end-position pointer NABPE does not overtake the start-position pointer NABPS. In other words, in the database reorganization system that is a preferred embodiment of the invention as it concerns primary keys, when the end-position pointer NABPE reaches the final position (bottommost in the drawing) of the unused block allocation table UBAT, the end-position pointer NABPE is returned to the beginning (uppermost in the drawing) of the unused block allocation table UBAT again and the unused block allocation table UBAT may thus be recycled.
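The cyclical use of the unused block allocation table amounts to a ring buffer, which may be sketched as follows. The class name, the method names and the count field are assumptions introduced for illustration; in particular, the count is one hypothetical way of ensuring that NABPE never overtakes NABPS, and wrapping NABPS in the same way as NABPE is assumed rather than stated in the text.

```python
class UnusedBlockAllocationTable:
    """Ring buffer of unused block addresses (illustrative sketch only)."""

    def __init__(self, size):
        self.entries = [None] * size
        self.nabps = 0   # start-position pointer: next unused block to hand out
        self.nabpe = 0   # end-position pointer: next free entry to register into
        self.count = 0   # registered but not yet reutilized blocks (assumption)

    def register(self, block_address):
        """Register a block that has become unused."""
        if self.count == len(self.entries):
            raise OverflowError("NABPE would overtake NABPS")
        self.entries[self.nabpe] = block_address
        # Return to the beginning of the table after the final position.
        self.nabpe = (self.nabpe + 1) % len(self.entries)
        self.count += 1

    def acquire(self):
        """Return an unused block address for reuse, or None if there is none."""
        if self.count == 0:
            return None  # caller acquires the block from a new area instead
        address = self.entries[self.nabps]
        self.nabps = (self.nabps + 1) % len(self.entries)
        self.count -= 1
        return address
```

Registering seven blocks leaves NABPE at index 7, that is, the eighth entry, and two subsequent acquisitions leave NABPS at index 2, which corresponds to the states described for FIG. 5.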

Database Access During Reorganization

Next, a description follows, referencing FIG. 7, of enabling data retrieval, reading and writing during reorganization in the database reorganization system that is a preferred embodiment of the present invention as it concerns primary keys.

That it is possible to retrieve, write and read data during reorganization means that it is possible to perform reorganization without shutting down the system and even as the data storage and retrieval system is in operation.

FIG. 7 illustrates data retrieval and read/write operations during reorganization in the database reorganization system that is an embodiment of the invention as it concerns primary keys.

In this database reorganization system, retrieval with primary key values while reorganization is not running is performed by means of binary search using the current location table LC.

In this database reorganization system, if reorganization is running, the target key value (the key value to be retrieved) is assessed as less than (upwards in the drawing) or greater than (downwards in the drawing) the primary key value that the reorganization pointer RPLC is pointing to. Since primary keys are listed in the order of their values in the current location table LC, this may be achieved by comparing the key value of the entry that the reorganization pointer RPLC is pointing to and the target key value.

If the target key value is less than the low value of the primary key value in the entry that the reorganization pointer RPLC is pointing to, then the target entry exists upwards from (at a smaller address than) the reorganization pointer RPLC. In this case the new location table LN is used to perform a binary search between the first address in the new location table LN and the reorganization pointer RPLN (in the search region 101). As the result of the binary search, the records in the block that the target entry is pointing to are examined and it is determined whether the target record is present or not.

If the target key value is equal to or greater than the low value of the primary key value of the entry that the current reorganization pointer RPLC is pointing to, the target entry exists downwards from the current reorganization pointer RPLC (at the entry RPLC is pointing to or at a larger address). In this case the current location table LC is used to perform a binary search on the entries between the current reorganization pointer RPLC and the final pointer in the current location table LC (in the search region 102).

Thus, in the database reorganization system that is a preferred embodiment of the present invention as it concerns primary keys, the current reorganization pointer RPLC is used to make a comparison with the target key value and assess it as less than (upwards in the drawing) or greater than (downwards in the drawing) the primary key value that the current reorganization pointer RPLC is pointing to. The target entry may then be definitively retrieved by performing a binary search on the location table LN if the target key value is less than that primary key value, or on the location table LC if it is equal to or greater than that primary key value, and so the record holding the target primary key value may be retrieved.
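The choice between search region 101 and search region 102 may be sketched as follows. This is a minimal illustration, not the implementation of the invention: each location table is modelled as a Python list holding only the low primary key value of each entry, the pointers RPLC, RPLN and the final pointer are list indices, and the function name locate_entry is hypothetical.

```python
from bisect import bisect_right


def locate_entry(lc_lows, ln_lows, rplc, rpln, final_pointer, target):
    """Return (table_name, entry_index) for target, or None if no entry fits.

    lc_lows / ln_lows: low key values of the entries in LC and LN.
    rplc / rpln: indices the reorganization pointers are pointing to.
    """
    if rplc < final_pointer and target >= lc_lows[rplc]:
        # Search region 102: LC between RPLC and the final pointer.
        i = bisect_right(lc_lows, target, rplc, final_pointer) - 1
        return ("LC", i)
    # Search region 101: LN between its first entry and RPLN.
    i = bisect_right(ln_lows, target, 0, rpln) - 1
    return ("LN", i) if i >= 0 else None
```

In this sketch the first two entries of LC (low keys 10 and 20) have already been rewritten as three entries of LN (low keys 10, 15 and 20), so a search for key 17 is directed to LN while a search for key 45 is directed to LC.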

In the database reorganization system that is a preferred embodiment of the present invention, a block containing the record that holds the target primary key value cannot be accessed while it is actually under reorganization, by reason of exclusion, and the request is queued for release from exclusion. This state does not in any way constitute a problem, since it is no different from the update, insertion or deletion of records in normal access. In other words, requests to excluded blocks are queued for their release from exclusion and may be processed once reorganization of that block is complete and it is released from exclusion.

Retrieval by Means of Alternate Key

Retrieval by means of alternate key in the database reorganization system that is a preferred embodiment of the present invention as it concerns primary keys is described with reference to FIGS. 1 through 4.

The foregoing description concerns an instance of retrieval by means of primary key during the reorganization of a location table and blocks, and retrieval by means of alternate key is as follows. Retrieval by means of alternate key consists of using alternate-key tables (reference numerals 11A, 11B and 11C in FIG. 1) or, as discussed below, alternate-key location tables to retrieve the target alternate-key entry.

Once the alternate-key entry is found, its content is applied to retrieve the record. As described above, alternate-key entries may have one of three formats: (i) entries maintaining block numbers, (ii) entries not maintaining block numbers and (iii) entries maintaining block addresses.

Given format (ii), in which entries do not maintain block numbers, a binary search is performed on the location table with the primary key in exactly the same fashion as described above to access a database undergoing reorganization. If the entries maintain block addresses, the block at that address is known to be the target block because block addresses are not modified in reorganization. However, if the entries maintain block numbers, the number must be identified as one in the current location table or one in the new location table, since block numbers are updated in reorganization.

This identification is performed as follows.

If the block number the current reorganization pointer RPLC is pointing to is greater than the block number the new reorganization pointer RPLN is pointing to, identification is performed as follows. It is not possible for the object of retrieval to be a block between the block number the current reorganization pointer RPLC is pointing to and the block number the new reorganization pointer RPLN is pointing to. The reason is that since reorganization in the current location table LC has progressed through the location of the reorganization pointer RPLC and reorganization has not run on the section of the new location table LN beyond the reorganization pointer RPLN, blocks with block numbers greater than the block the new reorganization pointer RPLN is pointing to will not be the object of retrieval.

That is, when the object of retrieval is a block number less than the block number the new reorganization pointer RPLN is pointing to, the new location table LN is used. When the object of retrieval is a block number equal to or greater than the block number the reorganization pointer RPLC is pointing to, the current location table LC is used.

If the block number the current reorganization pointer RPLC is pointing to is less than the block number the new reorganization pointer RPLN is pointing to, identification is performed as follows. When the block number that is the object of retrieval is less than the block number the reorganization pointer RPLC is pointing to, the new location table LN is used. When the block number that is the object of retrieval is greater than the block number the reorganization pointer RPLN is pointing to, the current location table LC is used.

When the block number that is the object of retrieval is equal to or greater than the block number the current reorganization pointer RPLC is pointing to and less than the block number the new reorganization pointer RPLN is pointing to, the block number alone is insufficient to uniquely specify which location table is appropriate, and so the following procedure is performed.

In the results of retrieval from an alternate-key table, the alternate-key entry obtained contains a primary key. This primary key value is compared with the primary key value in the entry of the location table that the reorganization pointer RPLN is pointing to, and if the primary key value of the alternate-key entry is less than the primary key value of the reorganization pointer RPLN, the new location table LN is used. If the primary key value of the alternate-key entry is greater than or equal to the primary key value of the reorganization pointer RPLN, the current location table LC is used.
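The identification rules for alternate-key entries that maintain block numbers may be gathered into one decision function, sketched below. The function name choose_table and its parameter names are hypothetical. The text does not state which table applies when the target block number equals the block number RPLN is pointing to; the sketch assumes the current location table LC in that case, on the reasoning that the entry RPLN is pointing to has not yet been written.

```python
def choose_table(block_no, primary_key, rplc_block_no, rpln_block_no, rpln_key):
    """Decide which location table a block number refers to.

    block_no, primary_key: values held in the alternate-key entry.
    rplc_block_no, rpln_block_no: block numbers RPLC and RPLN are pointing to.
    rpln_key: primary key value at the entry RPLN is pointing to.
    Returns "LN", "LC", or None when the block cannot be an object of retrieval.
    """
    if rplc_block_no >= rpln_block_no:
        if block_no < rpln_block_no:
            return "LN"
        if block_no >= rplc_block_no:
            return "LC"
        return None  # between the two pointers: cannot be an object of retrieval
    if block_no < rplc_block_no:
        return "LN"
    if block_no >= rpln_block_no:  # equality treated as LC (assumption)
        return "LC"
    # Ambiguous range: fall back on the primary key in the alternate-key entry.
    return "LN" if primary_key < rpln_key else "LC"
```

Only the ambiguous range, where the block number lies at or above that of RPLC and below that of RPLN, requires the primary key comparison; every other case is settled by block numbers alone.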

The foregoing description concerns record retrieval, but its application also permits record insertion, updating and deletion.

First, a description follows of insertion. To insert a record, it must first be determined by means of the primary key value of the record to be inserted at which position in which block to insert the record. This is because the data storage and retrieval system described above has a specified storage system in which records are stored in blocks in the order of their primary key values, and the primary key value of a record stored in a block anterior to another block is less than the primary key values of records stored in that other block.

The insertion position of a record is obtained by finding the block into which the record should be inserted and searching within that block, with exactly the same retrieval methods as described above. Here, the pertinent entries in the location table and the block subject to insertion are placed under exclusion. As described for retrieval, if that block is actually undergoing reorganization, it is under exclusion and the exclusion instruction is queued for release from exclusion; the block may be placed under exclusion and the following operations performed once the reorganization exclusion is lifted.

The description first concerns itself with cases in which no overflow blocks are linked to the block. If sufficient free space to store the inserted record exists in that primary block, records located after the insertion location are shifted towards the end by the length of the record inserted and the inserted record is written to the area freed up.

If sufficient free space to store the inserted record does not exist in that primary block, a new overflow block is linked to the primary block, the post-insertion storage volume is calculated, sufficient records are moved to the overflow block to reserve the suitable initial storage rate in the primary block, and the inserted record is stored at the appropriate location in the primary block or the overflow block.

If an overflow block is linked to the primary block, the operation described above is performed reckoning the primary block and any overflow blocks as a single unit.
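The insertion logic above may be sketched in heavily simplified form. The sketch makes several assumptions that are not in the text: records are represented by their primary key values only, capacity is counted in records rather than bytes, at most one overflow block is modelled, and the function name insert_record is hypothetical.

```python
from bisect import insort


def insert_record(primary, overflow, key, capacity, initial_rate):
    """Insert key and return the resulting (primary, overflow) key lists.

    primary: sorted keys in the primary block.
    overflow: sorted keys in the linked overflow block, or None if none is linked.
    """
    if overflow is None and len(primary) < capacity:
        # Free space exists: later records shift towards the end.
        insort(primary, key)
        return primary, None
    # Otherwise treat the primary block and the overflow block as a single
    # unit and keep only the suitable initial storage rate in the primary block.
    merged = sorted(primary + (overflow or []) + [key])
    keep = int(capacity * initial_rate)
    return merged[:keep], merged[keep:]
```

With a capacity of four records and a suitable initial storage rate of 0.5, inserting key 25 into a full block holding 10, 20, 30 and 40 leaves 10 and 20 in the primary block and moves 25, 30 and 40 to a newly linked overflow block.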

Next, alternate keys are added. If the inserted record has alternate keys of three kinds (A, B and C), alternate-key tables are searched for each kind. The following description applies to alternate-key table A. A detailed description of a method for retrieval by means of alternate key during reorganization is provided in the discussion on reorganization of alternate-key tables and so is omitted here. Once the alternate-key block where an alternate-key entry is to be inserted is retrieved, the location at which the alternate-key entry is inserted inside the alternate-key block is determined. At this point, the alternate-key block and, if any alternate-key overflow blocks are linked to that alternate-key block, those alternate-key overflow blocks are placed under exclusion. If an alternate-key location table is used, the pertinent entries in that alternate-key location table are also placed under exclusion. If that alternate-key block is actually undergoing reorganization (the reorganization of alternate-key tables is discussed below), it is under exclusion and the exclusion instruction is queued for release from exclusion; the alternate-key block may be placed under exclusion and the following operations performed once the reorganization exclusion is lifted.

The description initially addresses instances in which there are no alternate-key overflow blocks linked to that alternate-key block.

If sufficient free space to store the inserted alternate-key entry exists in that alternate-key block, alternate-key entries located after the insertion location are shifted towards the end by the length of the alternate-key entry inserted and the inserted alternate-key entry is written to the area freed up.

If sufficient free space to store the inserted alternate-key entry does not exist in that alternate-key block, a new alternate-key overflow block is linked to the alternate-key block, the post-insertion storage volume is calculated, sufficient alternate-key entries are moved to the alternate-key overflow block to reserve the suitable initial storage rate in the alternate-key block, and the inserted alternate-key entry is stored at the appropriate location in the alternate-key block or the alternate-key overflow block.

If an alternate-key overflow block is linked to that block, the operation described above is performed reckoning the alternate-key block and any alternate-key overflow blocks as a single unit.

Likewise, exactly the same operations are performed for alternate keys B and C.

Performing the above sequence of operations completes the record insertion. Exclusion is now lifted on any entries in the location table, blocks, overflow blocks, alternate-key blocks and alternate-key overflow blocks that had been placed under exclusion.

Next, the discussion addresses the updating of records.

Updating a record likewise first requires the retrieval of that record. Retrieval is performed with the retrieval method described above. Once the record is thus found, the pertinent entries in the location table and the pertinent blocks are placed under exclusion. The record may then be updated. If the length of the record remains unchanged and no alternate keys are modified, the update is complete and exclusion is lifted on the location table and blocks.

If an update results in a change to the length of a record, operations are similar to those for insertion. If the record is now longer, it is determined whether there is sufficient storage space for the additional length. If there is sufficient storage space, the records after the storage location are shifted towards the end by the requisite number of additional bytes and the storage space occupied by the record prior to update is combined with the newly reserved storage space to store the updated record.

If there is insufficient free space, an overflow block is linked and records are moved to the overflow block, the procedures here being the same as those for insertion.

If an alternate-key value has been modified, the alternate-key table must be modified. A modified alternate-key value results in the deletion of the old alternate-key value entry and the addition of a new alternate-key value entry.

The addition of an alternate-key entry involves the same procedure as for the insertion of a record. To delete an alternate-key entry, the alternate-key block affected is retrieved and the alternate-key entry to be deleted is found. When the alternate-key block is found, the alternate-key block and, if any alternate-key overflow blocks are linked to that alternate-key block, those alternate-key overflow blocks are placed under exclusion. If an alternate-key location table is used, the pertinent entries in that alternate-key location table are also placed under exclusion. The entry to be deleted is deleted, and the alternate-key entries after the deleted entry are shifted towards the beginning of the alternate-key block. If the alternate-key entry to be deleted is stored in an alternate-key overflow block, alternate-key entries are moved within the alternate-key overflow block. When deleting an alternate-key entry in an alternate-key block to which an alternate-key overflow block is linked, some of the alternate-key entries in the alternate-key overflow block may be moved to the alternate-key block, but they need not be moved since not moving them will not cause any operational problems. This applies likewise to instances in which the alternate-key entry affected exists in an alternate-key overflow block.

When multiple alternate keys have been modified, the above operations are executed on the alternate-key tables for which they are necessary.

Performing the above sequence of operations completes record updating operations, and so any exclusion applied is lifted.

Next, the description addresses the deletion of records. The deletion of a record also first requires the retrieval of the record to be deleted. Once the record is found, the affected entries in the location table and the affected blocks are placed under exclusion.

Next, the record to be deleted is deleted, and the records after that record are moved towards the beginning by the amount of space that had been occupied by the deleted record. When deleting a record in a primary block to which an overflow block is linked, some of the records in the linked overflow block may be moved to the primary block, but they need not be moved since not moving them will not cause any operational problems. This applies likewise to instances in which the record affected exists in an overflow block.

In this way, records may be retrieved, inserted, updated and deleted while reorganization is underway. Since the addition of a record is a variation on insertion, an addition may be performed in the same fashion as an insertion.

Handling Advances in Reorganization During Retrieval Operations

FIG. 8 illustrates operation when reorganization advances during a retrieval operation in the database reorganization system that is an embodiment of the invention as it concerns primary keys.

It has been explained how it is possible, in the database reorganization system that is a preferred embodiment of the present invention as it concerns primary keys, to call records during reorganization by using the reorganization pointer to make comparisons with the target key value and deciding whether to use the current location table LC or the new location table LN.

However, as shown in FIG. 8, reorganization may advance during a retrieval by means of a primary key using the current location table LC, so that the position of the reorganization pointer RPLC at the end of the retrieval (S32 in FIG. 8) differs from its position at the start of the retrieval (S31 in FIG. 8). If the blocks in that range were subject to the search, there is a possibility that records that actually exist may not be found because overflow blocks have already been delinked. In FIG. 8 an overflow block 13 (block number 5-2) has been delinked from block number 5 in the primary blocks 12. Left as is, this state leads to unstable operation and renders the system unusable.

Given this inconvenience, retrievals may be performed without problem by implementing the following measures. The target key value and reorganization pointer are used to determine whether the location table subjected to the search is the current location table LC or the new location table LN. If the location table used is the new location table LN, no problem arises even if the reorganization pointer RPLN has advanced from when the search started. Problems arise when it is the current location table LC that is used. When the current location table LC is searched, unless measures are implemented, problems will arise if the reorganization pointer RPLC advances from when the search started.

Access Methods When Reorganization Advances

If reorganization is underway, the value of the current reorganization pointer RPLC and the value of the new reorganization pointer RPLN are saved to specific areas in memory before initiating retrieval. These are S-RPLC and S-RPLN. Additionally, the value of the reorganization pointer RPLN at the point the search is completed is saved as E-RPLN.

At the point the search in the current location table LC is completed, the value of the reorganization pointer RPLC (which is termed “E-RPLC”) is compared with the value of S-RPLC. If these values are different, this means that reorganization advanced during the search. In this case, it is determined where the block that is the object of retrieval is. If the determination finds that the block is between S-RPLC and E-RPLC, it is possible, as discussed above, that the record cannot be retrieved. In this case, the new location table LN is used to perform a binary search between S-RPLN and E-RPLN. Here, if the record can be detected, the record is reckoned to exist, and if it cannot be detected, the record is reckoned not to exist. This permits records to be definitively retrieved, and so the phenomenon of a record that exists being reported as non-existent does not arise. As described above, records may be read even during reorganization.
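The check against the saved pointer values may be sketched as follows. This is an illustration under assumptions: pointer values are modelled as list indices, the new location table LN is a list of low primary key values, and the function names needs_recheck and recheck_in_new_table are hypothetical.

```python
from bisect import bisect_right


def needs_recheck(s_rplc, e_rplc, hit_index):
    """True if reorganization advanced during the search and the entry found
    in LC lies in the range reorganized in the meantime."""
    return e_rplc != s_rplc and s_rplc <= hit_index < e_rplc


def recheck_in_new_table(ln_lows, s_rpln, e_rpln, target):
    """Binary search of LN restricted to the entries written during the search.

    Returns the index of the LN entry whose block may hold target, or None.
    """
    i = bisect_right(ln_lows, target, s_rpln, e_rpln) - 1
    return i if i >= s_rpln else None
```

A caller would save S-RPLC and S-RPLN before the search, read E-RPLC and E-RPLN afterwards, and call recheck_in_new_table only when needs_recheck returns True; the block found in LN must still be examined for the record itself.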

Record Insertion and Updating

The insertion of a record, as discussed above with respect to calling a record, requires, in order to determine the block into which the record will be inserted, performing a binary search to find the block and then inserting the record into that block, which consists of the same operations as calling a record.

Since data updating is performed by first reading a record and then updating and storing it, it applies the process described above for calling a record, but the following applies when a primary key value is modified.

When a primary key value in a record is modified, the location in which the record is stored must change. The reason is that records are stored in blocks in the order of their primary keys, and if a record exists outside that range, it cannot be retrieved with the location table. Therefore, when a primary key value is modified, the current record is deleted and a new record is then inserted in the block identified on the basis of the primary key value as that where it should be stored. This is a method that has been in general use in conventional databases.

Otherwise, blocks may be found and written with the same methods as are used for calling records.

Suspending and Resuming Reorganization

It is possible, as described above, to retrieve (read), update, insert, add and delete records by means of primary keys even while the location table is under reorganization. In short, it goes without saying that records may be accessed however far reorganization has advanced or, in other words, from any entry in the location table.

It follows that record access may be executed even if reorganization is temporarily suspended. Suspension is, of course, effected after reorganization has completed on some given selection of blocks.

In the database reorganization system that is a preferred embodiment of the present invention as it concerns primary keys, reorganization may be resumed with the entry in the current location table LC and the new location table LN indicated by the current reorganization pointer RPLC and the new reorganization pointer RPLN at the point reorganization was suspended.

Since these functions permit reorganization to be suspended and resources allocated to data processing when the load on the primary system increases, and then resumed when the load of data processing falls, there is no need to make advance forecasts of the load on the primary system and operating conditions and reserve a fixed period of time for reorganization in advance.
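Suspension and resumption may be sketched as a loop that checks a load condition between passes. The function name run_reorganization, the should_suspend callback and the fixed pass size are assumptions for illustration; only the RPLC position is modelled, and the actual work of a pass is omitted.

```python
def run_reorganization(final_pointer, rplc, pass_size, should_suspend):
    """Advance RPLC pass by pass; return its position on suspension or completion.

    should_suspend: hypothetical callback returning True when the load on the
    primary system calls for reorganization to yield.
    """
    while rplc < final_pointer:
        if should_suspend():
            return rplc  # suspended between passes; the pointer marks where to resume
        # One pass reorganizes up to pass_size entries (work omitted here).
        rplc = min(rplc + pass_size, final_pointer)
    return rplc  # equals final_pointer: reorganization is complete
```

Because suspension occurs only between passes, the value returned can be passed back in as rplc to resume from exactly the point at which reorganization stopped.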

Overflow Block Formats

The drawings provide an example of an overflow block format in the data storage and retrieval system described above. According to this format, an overflow block does not maintain the low value and high value of the primary key values of the records in that block. The example in the drawings is one in which, in addition to the low value and high value of the primary key values of records in primary blocks, the low value and high value of the primary key values of records in overflow blocks are maintained in primary blocks.

Overflow Block Formats

The following formats have been devised as alternatives. Primary blocks maintain the low value and high value of the primary keys of the records in the primary block. Overflow blocks likewise maintain the low value and high value of the primary keys of the records in the overflow block. And entries in the location table maintain either one or both of the low value and high value of the primary keys of records stored in the primary blocks managed by those entries and in all of the overflow blocks managed by those primary blocks.

Where overflow blocks do not maintain the low value and high value of primary keys, that much more space may be allocated to the storage of records. However, when overflow blocks are liquidated and made into primary blocks in reorganization, they will then maintain the low value and high value of the primary keys and so if one is full with stored records, it will lack space to maintain the low value and high value of primary keys and an overflow block must be created. And when large numbers of overflow blocks are linked to a single primary block, the problem arises that records in the overflow blocks must be read sequentially in order to find a target record, thus adding to retrieval time. The beneficial effects detailed below may be obtained by maintaining in primary blocks the low value and high value of the primary key values of records in those primary blocks and likewise maintaining in overflow blocks the low value and high value of the primary key values of records in those overflow blocks.

New Overflow Block Formats

As described above, according to the database reorganization system that is a preferred embodiment of the present invention as it concerns primary keys, changing an overflow block to a primary block in reorganization does not result in a new overflow block due to the maintenance introduced of the low value and high value of primary keys, since there are no modifications of block format.

The database reorganization system that is a preferred embodiment of the present invention as it concerns primary keys additionally addresses the problem that records in the overflow blocks must be read sequentially in order to find a target record, thus adding to retrieval time when large numbers of overflow blocks are linked to a single primary block, with the capability of reading the low values and high values of primary keys in primary blocks and, if the target key value falls within that range, evaluating whether the target record may exist among the primary blocks. It is known that if the target key value exceeds the high value, this indicates that the target record may exist in the overflow block; that if the first overflow block is read, the target key value is compared with the low value and the high value of the primary key values in that block, and the target record may exist within the block if the target key value falls within that range; and that if the target key value exceeds the high value, the target record may exist in one of the subsequent blocks.

Thus, according to the database reorganization system that is a preferred embodiment of the present invention as it concerns primary keys, it is possible to reduce target-record search times by wide margins when large numbers of overflow blocks are linked.

The above describes the primary system as existing on a single server, but since the primary system is a logical construct, it may exist on multiple servers.
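The range comparison along a chain of overflow blocks may be sketched as follows. This is an illustrative model only: `Block` and `find_record` are hypothetical names, each block is assumed to be non-empty, and keys are assumed to ascend along the chain. The records of a block are examined only when the target key falls within that block's low and high values.

```python
class Block:
    """A primary or overflow block that maintains its own low and high key values."""

    def __init__(self, records, overflow=None):
        self.records = dict(records)      # primary key -> payload (assumed non-empty)
        self.low = min(self.records)
        self.high = max(self.records)
        self.overflow = overflow          # next overflow block in the chain, or None


def find_record(primary, key):
    """Walk the chain, comparing `key` with each block's low/high values."""
    block = primary
    while block is not None:
        if key < block.low:
            return None                   # key falls in a gap before this block
        if key <= block.high:
            return block.records.get(key) # only this block can hold the record
        block = block.overflow            # key exceeds high value: try next block
    return None


third = Block({200: "c", 299: "d"})
second = Block({100: "b"}, overflow=third)
primary = Block({0: "a", 99: "z"}, overflow=second)
```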

Reorganization Rewriting Blocks

The above description concerns reorganization of location tables, blocks and overflow blocks. This implementation of reorganization has the benefits of holding the rewriting of blocks and overflow blocks to a minimum and abbreviating reorganization times by rewriting the current location table to a new location table. However, blocks must be rewritten in order to change the size of blocks or in order to change the block storage medium. Application of the system described above allows ready execution of reorganization while thus rewriting blocks. The description makes reference to FIG. 19.

FIG. 19 illustrates reorganization with respect to the elimination of overflow blocks, but the elimination of fragmentation and the reservation of suitable initial storage rates may be implemented in entirely like fashion by applying the methods described in the invention as it concerns primary keys.

The top part of FIG. 19 depicts a current database. The bottom part depicts a new database. First, a new location table LN is created corresponding to the current location table LC. Next, the first LC entry is read and the current block 0 is transferred to the block 0 in the new database. Then the LC entry 0 is transferred to LN. The reason this order is applied is that the address of the block 0 in the new database is determined after the transfer, but the order may be reversed if the address is determined in advance.

Next, the LC entry 1 is transferred to LN, but since two overflow blocks are linked to the entry 1, the LC entry 1 becomes the three entries 1, 2 and 3 in LN. After the current blocks 1, 1-2 and 1-3 are transferred to the blocks 1, 2 and 3 in the new database, the LN entries 1, 2 and 3 are created.

FIG. 19 depicts the point at which processing has completed through blocks 2 and 2-2 in the current database. The chained-and-dotted lines in the drawing indicate the action of transfer. As indicated for the invention as it concerns primary keys, one reorganization pointer each is created and provided to LC and LN. The numerals assigned to the LN entries are block numbers, and the numerals in parentheses are the original block numbers in the current database.
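The transfer order described for FIG. 19 may be sketched as follows. This is a simplified model under stated assumptions: `rewrite_blocks` is a hypothetical name, each LC entry is represented as a chain (the primary block followed by its overflow blocks), the new database is a plain list, and exclusion and the reorganization pointers are omitted. Each block is transferred first, and its LN entry is created once its new block number is known.

```python
def rewrite_blocks(lc):
    """Copy every block of every LC chain into a new database and build LN.

    `lc` is a list of chains; each chain is a list of
    (original_block_number, records) pairs, primary block first.
    Returns (ln, new_blocks), where ln holds
    (new_block_number, original_block_number) pairs.
    """
    new_blocks = []
    ln = []
    for chain in lc:
        for original_number, records in chain:
            new_blocks.append(list(records))                  # transfer the block
            ln.append((len(new_blocks) - 1, original_number)) # then create the LN entry
    return ln, new_blocks


# Layout of the current database as described for FIG. 19 (record contents are placeholders).
lc = [
    [("0", ["r0"])],
    [("1", ["r1"]), ("1-2", ["r2"]), ("1-3", ["r3"])],
    [("2", ["r4"]), ("2-2", ["r5"])],
]
ln, new_blocks = rewrite_blocks(lc)
```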

In order to simplify the description, the blocks here are of the same size in the current and new databases, but the block size may also be changed. Reorganization is completed at the point when the final pointer of the current location table is pointing to the same address as the reorganization pointer. In FIG. 19 the six current blocks 0, 1, 1-2, 1-3, 2 and 2-2 are left in existence, but they may be deleted after transfer to the new database.

Also in entirely like fashion as described for the invention as it concerns primary keys, database access is to the new database when the target key value is less than the reorganization pointer and to the current database when the target key value is greater than or equal to the reorganization pointer. The suspension and resumption of reorganization is also in entirely like fashion as for the invention as it concerns primary keys.

The current and new databases are depicted here as present on the same machine, but they may also be present on different machines.
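The routing rule may be sketched as follows. The sketch assumes that the reorganization pointer can be expressed as a key value, here the hypothetical parameter `pointer_key`, meaning the lowest key not yet transferred; both databases are modelled as dictionaries.

```python
def lookup(key, current_db, new_db, pointer_key):
    """Route a read to the new or the current database.

    Keys below `pointer_key` have already been transferred and are read
    from the new database; all other keys are read from the current one.
    """
    if key < pointer_key:
        return new_db.get(key)
    return current_db.get(key)


current_db = {10: "old-10", 20: "old-20", 30: "old-30"}
new_db = {10: "new-10"}            # only key 10 has been transferred so far
```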

Reorganization of Alternate-Key Tables

Alternate-Key Table Formats

In the data storage and retrieval system, alternate-key tables were of a format without location tables. A more advantageous format has been devised for the invention as it concerns alternate keys. In the data storage and retrieval system proposed by the inventors, alternate-key blocks have a format maintaining the low values and high values of the alternate-key values of the entries contained in that block and the low values and high values of the alternate-key values of the entries contained in alternate-key overflow blocks linked to that block.

The new format for alternate-key tables is one that employs location tables for alternate-key tables as well.

Reorganization of alternate-key tables lacking alternate-key location tables may be implemented by creating new alternate-key blocks for current alternate-key blocks and sequentially transferring from the current alternate-key blocks to the new alternate-key blocks. In like fashion as for location tables and blocks, reorganization pointers are used here as well, alternate-key overflow blocks are made into alternate-key blocks and the elimination of fragmentation and the reservation of suitable initial storage rates are performed.

Objectives of Reorganization of Alternate-Key Tables

The objectives are the same as those of the reorganization of location tables and blocks. The three objectives are the elimination of alternate-key overflow blocks, the elimination of fragmentation and the reservation of suitable initial storage rates. Since each of these objectives is discussed in detail for the reorganization of location tables and blocks, detailed individual descriptions of them are omitted here.

New Alternate-Key Table Format

Addition of Alternate-Key Location Tables

A description follows, with reference to FIGS. 9 and subsequent, of the database reorganization system that is a preferred embodiment of the present invention as it concerns alternate keys.

FIG. 9 illustrates the automatic database reorganization system that is a preferred embodiment of the present invention as it concerns alternate keys.

The database reorganization system that is a preferred embodiment of the present invention in FIG. 9 employs a new format for alternate-key tables 11A, 11B and 11C, which is to use location tables for the alternate-key tables 11A, 11B and 11C. That is, the database reorganization system that is a preferred embodiment of the present invention as it concerns alternate keys uses an alternate-key location table AALC for the purpose of managing alternate-key blocks 17 of the alternate-key tables 11A, 11B and 11C. This alternate-key location table AALC has the same functionality as the location table for blocks described in the preferred embodiment of the present invention as it concerns primary keys.

In FIG. 11 alternate-key table 11A is comprised of an alternate-key location table AALC and alternate-key blocks 17, alternate-key table 11B is comprised of an alternate-key location table AALC and alternate-key blocks 17, and alternate-key table 11C is also comprised of an alternate-key location table AALC and alternate-key blocks 17. Therefore, the alternate-key table 11A is held, in this discussion of this database reorganization system that is a second preferred embodiment of the present invention, to stand for the others, and discussion of the alternate-key tables 11B and 11C is omitted.

FIG. 10 illustrates an alternate-key table in a primary system in the database reorganization system that is an embodiment of the invention as it concerns alternate keys.

In FIG. 10, the entries in alternate-key location table AALC (alternate-key location table entries) in primary system 1 maintain the addresses of alternate-key blocks AAC that those entries manage. Additionally and as needed, they maintain the numbers of the alternate-key blocks AAC and either or both of the low values and high values of the alternate-key values of the alternate-key entries stored in those alternate-key blocks AAC and in alternate-key overflow blocks 15 and 16 linked to those alternate-key blocks AAC. The alternate-key overflow blocks 15 and 16 are managed by the alternate-key block AAC and so are not managed by the alternate-key location table AALC.

The maintenance of the low values and high values of the alternate-key values of the alternate-key entries stored in the alternate-key blocks AAC and the alternate-key overflow blocks 15 and 16 is like that of the first format in the present invention as it concerns alternate keys. This alleviates the load when retrieving a target alternate-key entry when multiple alternate-key overflow blocks 15 and 16 are linked to the alternate-key blocks AAC.

The utilization of such a format allows the maintenance in alternate-key location tables AALC of the low values and/or high values of the alternate-key values of the alternate-key blocks 17 and the alternate-key overflow blocks 15 and 16 linked to the alternate-key blocks 17. It is thus no longer necessary for the alternate-key blocks AAC to maintain the low values and high values of the alternate-key entries stored in themselves and in the alternate-key overflow blocks 15 and 16 linked to themselves, and it allows an identical format to be used for both the alternate-key blocks 17 and the alternate-key overflow blocks 15 and 16.

Advantages in the Use of Alternate-Key Location Tables

The following advantages may be expected of the use of the format described above in the database reorganization system that is a second preferred embodiment of the present invention.

First, there is no need to change the format of alternate-key blocks or alternate-key overflow blocks in reorganization, allowing alleviation of the load during reorganization.

Second, the transfer of alternate-key blocks and alternate-key overflow blocks is held to a minimum.

Third, whereas the format used in the data storage and retrieval system required that a contiguous region be acquired for alternate-key tables, when this format is adopted and alternate-key location tables are used, no inconvenience arises from having alternate-key blocks scattered in discontiguous regions if a contiguous region is acquired for the alternate-key location table, and so region acquisition may be performed with flexibility.

On the other hand, a disadvantage of this system is the routine need for extra space for the region of the alternate-key location table.

Reorganization of Alternate-Key Tables

Next, a description follows, making reference to FIG. 11, of methods of reorganization employing the new format for alternate-key tables in a data and database reorganization system that is the preferred embodiment of the present invention as it concerns alternate keys.

FIG. 11 here illustrates methods of reorganization in the database reorganization system that is the preferred embodiment of the invention as it concerns alternate keys.

In FIG. 11, the second implementation of alternate-key tables in the preferred embodiment of the present invention as it concerns alternate keys consists of providing an alternate-key location table AALC to alternate-key blocks 17 of an alternate-key table 11A, in which the entries of the alternate-key location table AALC manage the alternate-key blocks 17. Alternate-key overflow blocks 15 and 16 have almost the same relationship as that between an alternate-key block table and blocks.

Two Alternate-Key Location Tables and Two Reorganization Pointers: Transferring Alternate-Key Location Table Entries

Thus, reorganization in this case may be executed according to much the same logic as described above for the reorganization of a location table and blocks.

Reorganization of Alternate-Key Tables: Region Required for Reorganization

Two Location Tables

Performing automatic reorganization with this automatic reorganization system requires a pair of alternate-key location tables, one current and one new, for the alternate-key table subjected to reorganization.

Reorganization of Alternate-Key Tables: Elimination of Overflow

The description following concerns reorganization of the alternate-key table 11A in FIG. 9. As explained above, description concerning the alternate-key tables 11B and 11C is omitted, but their reorganization is entirely like that of the alternate-key table 11A.

In FIG. 9, FIG. 10 and FIG. 11, reorganization utilizing alternate-key location tables AALC and AALN is described below. It is only the alternate-key blocks 17 that are managed by alternate-key location table AALC. Alternate-key overflow blocks 15 and 16 are managed by alternate-key blocks 17. In other words, alternate-key blocks 17 maintain the addresses of the alternate-key overflow blocks 15. Therefore, when using the alternate-key location table AALC to retrieve data with an alternate key, the target alternate-key block 17 is found by performing a binary search on the alternate-key location table AALC, but it is further necessary to find the target alternate-key entry within that alternate-key block 17. If multiple alternate-key overflow blocks 15 and 16 are linked to the alternate-key block 17, it will take more time, because of the alternate-key overflow blocks 15 and 16, to find the target alternate-key entry than when only an alternate-key block 17 exists. In order to avoid this, the time required to find a target alternate-key entry may be reduced by doing away with the alternate-key overflow blocks 15 and 16 and managing all the alternate-key blocks in the alternate-key location table AALC.

Further, alternate-key entries are held in the alternate-key blocks 17 and the alternate-key overflow blocks 15 and 16 in the order of their alternate keys, and if there are many alternate-key overflow blocks, more shifting of alternate-key entries results when an alternate-key entry is inserted and efficiency falls during insertion, the prevention of which is a further objective.
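The binary search on the alternate-key location table may be sketched as follows. This is an illustration under assumptions: entries are modelled as `(low, high, address)` tuples sorted by low value, the addresses are placeholder strings, and `find_alternate_key_block` is a hypothetical name. The search within the block and its overflow chain is not shown.

```python
from bisect import bisect_right


def find_alternate_key_block(aalc, alt_key):
    """Binary-search AALC for the entry whose key range covers `alt_key`.

    Returns the address of the managing alternate-key block, or None when
    no entry covers the key.
    """
    lows = [low for low, _high, _address in aalc]
    index = bisect_right(lows, alt_key) - 1     # last entry with low <= alt_key
    if index < 0:
        return None
    _low, high, address = aalc[index]
    return address if alt_key <= high else None


aalc = [(0, 299, "blk-1"), (300, 499, "blk-2"), (500, 899, "blk-3")]
```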

Using Two Location Tables, One Current and One New, to Perform Reorganization

Since a count is kept of how many alternate-key overflow blocks 15 and 16 are generated, the number of entries in the new location table AALN is the sum of the number of entries in the alternate-key location table AALC and the number of overflow blocks 15 and 16. In order to contrast the location table in use at the time of reorganization with AALN, it shall be referred to as AALC. Since the number of entries may increase during reorganization and the number of alternate-key entries will increase with the addition of records after reorganization, a number greater than that necessary should be reserved.

However, since, as described below, the number of alternate-key blocks required may conversely decrease with the elimination of fragmentation and the number of alternate-key blocks required may vary with the reservation of suitable initial storage rates, the most preferable method is to calculate this figure from the volume of entries stored and the suitable initial storage rate.

A contiguous region sufficient to the volume of the new alternate-key location table AALN is secured in the primary system.
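The two ways of sizing AALN may be sketched as follows. The function names, the byte-based measure of stored volume and the percentage parameter are assumptions made for illustration; the source does not fix units. Integer arithmetic is used so that the ceiling is exact.

```python
def entries_from_counts(aalc_entries, overflow_blocks):
    """Size AALN as the AALC entry count plus the counted overflow blocks."""
    return aalc_entries + overflow_blocks


def entries_from_volume(stored_bytes, block_capacity, rate_percent):
    """Size AALN from the stored volume and the suitable initial storage rate.

    Each block is filled only to `rate_percent` of `block_capacity`, so the
    number of blocks is the stored volume divided by that usable space,
    rounded up.
    """
    usable_times_100 = block_capacity * rate_percent   # usable bytes, scaled by 100
    return -(-stored_bytes * 100 // usable_times_100)  # ceiling division


by_count = entries_from_counts(4, 3)
by_volume = entries_from_volume(10_000, 1_000, 90)   # 900 usable bytes per block
```

Either figure is a lower bound; as the text notes, a margin should be added for entries that arrive during and after reorganization.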

Once the region is secured, the entries are sequentially transferred from the current alternate-key location table AALC to the new alternate-key location table AALN, which procedure is performed as described below.

Reorganization Pointers

First, reorganization pointers are created. These indicate through which entry in the alternate-key location tables transfer has been completed, and so two are provided: a current reorganization pointer RPAALC for the current alternate-key location table AALC and a new reorganization pointer RPAALN for the new alternate-key location table AALN. The initial value of the current reorganization pointer RPAALC is the first address in the current alternate-key location table AALC, and the initial value of the new reorganization pointer RPAALN is the first address in the new alternate-key location table AALN.

Next, the first entry, to which the current reorganization pointer RPAALC points, and the alternate-key block 17 and alternate-key overflow blocks managed by that entry are placed under exclusion. Overflow blocks do not exist in this case, and so only the alternate-key block 17 (alternate-key block number 0) is affected.

Next, the first entry (block number 0) is transferred (S51 in FIG. 11) from the current alternate-key location table AALC to the new alternate-key location table AALN. When doing so, a check is made whether any overflow blocks are linked to the block managed by the first entry. If not, the addresses of the current reorganization pointer RPAALC and the new reorganization pointer RPAALN are modified to point to the second entry.

In FIG. 11, there are no alternate-key overflow blocks linked and so the current reorganization pointer RPAALC is modified to point to the second entry in the current alternate-key location table AALC. The new reorganization pointer RPAALN is likewise modified to point to the second entry in the new alternate-key location table AALN. Next, if the first entry had been placed under exclusion, exclusion is now lifted on it.

Next, the second entry in the current alternate-key location table AALC is placed under exclusion. Since two alternate-key overflow blocks 15 and 16 are linked to the second entry, the alternate-key block 17 and the alternate-key overflow blocks are placed under exclusion.

Next, the second entry (block number 1) is processed. Two alternate-key overflow blocks are linked to the alternate-key block managed by the second entry in AALC. The following applies when alternate-key overflow blocks are linked. The second entry in the alternate-key location table AALC is transferred to the second entry in the new alternate-key location table AALN, and the low value and high value of the alternate-key value of the entry stored in that block are modified.

The modification is made because, when the current alternate-key location table AALC entry holds the low value and high value of the alternate-key value of the alternate-key block 17, the low value and high value of the alternate-key value of the alternate-key location table AALC will not match the low value and high value of the alternate-key value of the entry stored in the alternate-key block 17 (alternate-key block number 1) if they are merely transferred.

Assume a low value of 0000 and a high value of 0299 for the alternate-key value of the second entry in the new alternate-key location table AALN. Further assume a low value of 0000 and a high value of 0099 for the alternate-key value of the entry stored in the alternate-key block 17 (alternate-key block number 1), a low value of 0100 and a high value of 0199 for the alternate-key value of the entry stored in the first alternate-key overflow block 15, and a low value of 0200 and a high value of 0299 for the alternate-key value of the entry stored in the second alternate-key overflow block 16.

The alternate-key value of the second entry in the new alternate-key location table AALN will have a low value of 0000 and a high value of 0099 (S52 in FIG. 11). The address of the alternate-key block maintained in the entry in the new alternate-key location table AALN will take the same value as the address of the alternate-key block 17 (alternate-key block number 1).

Next, the third entry in the new alternate-key location table AALN takes (S53 in FIG. 11) the address and the low value of 0100 and the high value of 0199 of the alternate-key value of the first alternate-key overflow block 15. The fourth entry in the new alternate-key location table takes (S54 in FIG. 11) the address of the second alternate-key overflow block 16 and the low value of 0200 and the high value of 0299 of the alternate-key value.

Next, the alternate-key overflow block address in the alternate-key block 17 (alternate-key block number 1) is reset and the alternate-key overflow block 15 delinked (S55 in FIG. 11). Next, the alternate-key overflow block address in alternate-key overflow block 15 is reset and the second alternate-key overflow block 16 delinked (S56 in FIG. 11).

Once reorganization is completed for a given entry in the new alternate-key location table AALN, exclusion is lifted on the block managed by that entry.

This allows reorganization to be performed without rewriting alternate-key overflow blocks to a separate location.
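Steps S52 through S56 may be sketched as follows, using the key ranges assumed in the example above. The class `AltKeyBlock`, the function `eliminate_overflow` and the string addresses are hypothetical; exclusion and the reorganization pointers are omitted. Each block in a chain receives its own AALN entry and is then delinked, so no block contents are moved.

```python
class AltKeyBlock:
    """An alternate-key block or overflow block with its own low/high values."""

    def __init__(self, low, high, overflow=None):
        self.low = low
        self.high = high
        self.overflow = overflow   # address of the linked overflow block, or None


def eliminate_overflow(aalc, blocks):
    """Build AALN entries in place: one per block and per former overflow block.

    `aalc` lists the addresses of the alternate-key blocks it manages;
    `blocks` maps an address to its AltKeyBlock.
    """
    aaln = []
    for address in aalc:
        while address is not None:
            block = blocks[address]
            aaln.append({"address": address, "low": block.low, "high": block.high})
            # Read the next address, then reset the link (delink the overflow block).
            address, block.overflow = block.overflow, None
    return aaln


blocks = {
    "A1": AltKeyBlock(0, 99, overflow="A15"),      # alternate-key block number 1
    "A15": AltKeyBlock(100, 199, overflow="A16"),  # first overflow block 15
    "A16": AltKeyBlock(200, 299),                  # second overflow block 16
}
aaln = eliminate_overflow(["A1"], blocks)
```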

Thereafter, the third entry (alternate-key block number 2) and the fourth entry (alternate-key block number 3) in the current alternate-key location table AALC are likewise reorganized (S60 in FIG. 11), and reorganization thus proceeds sequentially.

FIG. 11 depicts reorganization completed through the fourth entry in the current alternate-key location table AALC.

Next, the value of the new reorganization pointer RPAALN is rewritten (S61 in FIG. 11) to point to the beginning of the fourth entry in the new alternate-key location table AALN.

Reorganization of Alternate-Key Tables: Exceptions to Elimination of Overflow

The elimination of alternate-key overflow blocks has been discussed; the discussion below concerns instances in which overflow cannot be eliminated. Instances in which overflow cannot be eliminated are those in which many entries have the same alternate key and these cannot be stored in a single block. Entries having the same alternate key are not stored in multiple alternate-key blocks, but are always stored in a single alternate-key block and one or more alternate-key overflow blocks or in multiple alternate-key overflow blocks. Storage in multiple alternate-key overflow blocks occurs when entries are already stored in an alternate-key block and the alternate-key overflow block immediately subsequent and storage of the entries with the identical alternate key commences midway through the alternate-key overflow block.

Where entries with an identical alternate key are stored, maintaining that information in the alternate-key block and the alternate-key overflow blocks facilitates understanding. Information specifying whether it is the beginning, the middle or the end of the entries with the identical alternate key is maintained. Otherwise, it does not differ from the storage of regular entries.

In such a case, elimination may be performed on those sections of alternate-key overflow blocks that have regular entries, but since alternate-key blocks and alternate-key overflow blocks storing entries with an identical alternate key are necessarily of a structure entailing alternate-key overflow blocks, these alternate-key overflow blocks may not be eliminated. In such cases, information relating to the entries with the identical alternate key should be output as post-reorganization information.

Reorganization of Alternate-Key Tables of the Second Format: Elimination of Fragmentation

The description foregoing concerns methods of eliminating alternate-key overflow blocks; like overflow, fragmentation presents significant problems in terms of efficiency.

Fragmentation may be eliminated by means of operations similar to those for reorganization to eliminate alternate-key overflow blocks. The discussion makes reference to FIG. 12. FIG. 12 here illustrates a method for the elimination of fragmentation in the database reorganization system that is a third preferred embodiment of the present invention.

Alternate-key location tables are again used in FIG. 12, a new alternate-key location table AALN in addition to a current alternate-key location table AALC.

In FIG. 12, entries are moderately stored in alternate-key blocks 0, 1, 1-2, 1-3, 2 and 2-2 in alternate-key blocks 17, and the elimination of alternate-key overflow blocks has been completed according to the method described above.

The block numbers used in the description below are the block numbers in the current alternate-key location table AALC.

Where numbers from new alternate-key location table AALN are used, it is so stated. In the alternate-key blocks 17, entries are stored in alternate-key block number 3 up to 30% of its storage capacity. Entries are stored in alternate-key block number 4 up to 40% of its storage capacity. Entries are stored in alternate-key block number 5 up to 70% of its storage capacity, in alternate-key block number 5-2 of the alternate-key overflow blocks 15 up to 60% of its storage capacity and in alternate-key block number 6 of the alternate-key blocks 17 up to 70% of its storage capacity. The utilization rate of each block after reorganization is 90%. This is to prevent the generation of alternate-key overflow blocks immediately upon the insertion of records after reorganization.

The reorganization system that is a preferred embodiment of the present invention as it concerns alternate keys is about to perform the reorganization of alternate-key block number 3 of the alternate-key blocks 17 that the new reorganization pointer RPAALN of the new alternate-key location table is pointing to. However, its storage rate (the volume of records stored in that block as a proportion of the capacity of the block) is 30%, which does not satisfy the suitable initial storage rate. Therefore, attention turns to the next alternate-key block number 4 of the alternate-key blocks 17. Since the storage rate of this alternate-key block number 4 is 40%, adding the two alternate-key blocks together still falls short of the suitable initial storage rate (90%). Attention then turns to alternate-key block number 5 of the alternate-key blocks 17, which has a storage rate of 70% and would thus exceed the suitable initial storage rate of 90%. Leaving the records stored in block number 3 of the alternate-key blocks 17 untouched, the records in alternate-key block number 4 of the alternate-key blocks 17 are moved to alternate-key block number 3 of the alternate-key blocks 17. This gives alternate-key block number 3 of the alternate-key blocks 17 a storage rate of 70%. In order to achieve the suitable initial storage rate, the first 20% of the entries stored in alternate-key block number 5 of the alternate-key blocks 17 is moved to alternate-key block number 3 of the alternate-key blocks 17, and the remaining 50% of the entries is shifted to the beginning of that block (to the left in the drawing). Alternate-key block number 3 of the alternate-key blocks 17 is now done, and so the address of alternate-key block number 6 of the alternate-key blocks 17 of the new reorganization pointer RPAALN is rewritten to alternate-key block number 3 of the alternate-key blocks 17. The new reorganization pointer RPAALN is moved to the beginning of alternate-key block number 7 of the alternate-key blocks 17 in the new alternate-key location table AALN. At this point, alternate-key block number 4 of the alternate-key blocks 17 becomes an unused alternate-key block.

Next, operations are performed on alternate-key block number 5 of the alternate-key blocks 17, and alternate-key block number 5 of the alternate-key blocks 17 has a storage rate resulting from reorganization of 50%. Since alternate-key block number 5-2 of the alternate-key overflow blocks 15 has a storage rate of 60%, the first 30% of the entries in alternate-key block number 5-2 is moved to alternate-key block number 5 of the alternate-key blocks 17, and the remaining entries in alternate-key block number 5-2 are at the same time shifted to the beginning of that block (to the left in the drawing). The link between alternate-key block number 5 of the alternate-key blocks 17 and alternate-key block number 5-2 of the alternate-key overflow blocks 15 is then cut. This sets the alternate-key overflow block address of alternate-key block number 5 of the alternate-key blocks 17 to a specific value (for example, zero). Alternate-key block number 5 of the alternate-key blocks 17 is now done, and so the address of alternate-key block number 7 of the alternate-key blocks 17 in the new alternate-key location table AALN is rewritten to alternate-key block number 5 of the alternate-key blocks 17.

Next, since the alternate-key block 5-2 of the alternate-key overflow blocks 15 has a storage rate of 30% and the next alternate-key block number 6 of the alternate-key blocks 17 has a storage rate of 60%, all of the entries in alternate-key block number 6 of the alternate-key blocks 17 are moved to alternate-key block number 5-2 of the alternate-key blocks 17. Alternate-key block number 5-2 of the alternate-key blocks 17 is now done, and so the address of alternate-key block number 8 of the alternate-key blocks 17 in the new alternate-key location table AALN is rewritten to alternate-key block number 5-2 of the alternate-key blocks 17. The new reorganization pointer RPAALN is moved to the beginning of alternate-key block number 9 in the new alternate-key location table AALN. At this point, alternate-key block number 6 of the alternate-key blocks 17 becomes an unused alternate-key block.
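The block-filling procedure may be sketched as follows. This is a simplified model under stated assumptions: blocks are represented only by their storage rates in percent, in key order; `eliminate_fragmentation` is a hypothetical name; and blocks already above the target rate are left as they are. The usage lines reproduce the first step of the walk-through, in which blocks 3, 4 and 5 hold 30%, 40% and 70%.

```python
def eliminate_fragmentation(fills, target=90):
    """Pack entries forward until each destination block reaches `target` percent.

    `fills` lists storage rates in key order. Returns the new storage rates
    and the indices of blocks that were emptied and so became unused.
    """
    fills = list(fills)
    unused = []
    dst = 0                                    # block currently being filled
    for src in range(1, len(fills)):
        room = max(0, target - fills[dst])
        moved = min(room, fills[src])          # entries taken from the front of src
        fills[dst] += moved
        fills[src] -= moved
        if fills[src] == 0:
            unused.append(src)                 # emptied block is released
        else:
            dst = src                          # dst is full; the remainder of src is filled next
    return fills, unused


# Blocks 3, 4 and 5 of the walk-through, indexed here as 0, 1 and 2.
new_fills, unused = eliminate_fragmentation([30, 40, 70])
```

Here the first block ends at 90%, the second is emptied and released, and the third keeps the remaining 50%, matching the state described once block number 3 is done.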

Reorganization of Alternate-Key Tables: Reservation of Suitable InitialStorage Rates

The reservation of a suitable initial storage rate may require the addition of alternate-key blocks, the converse of handling fragmentation. What this means is that, for example, given a space utilization rate of 100% in all alternate-key blocks, achieving a suitable initial storage rate of 90% entails the addition of one alternate-key block to the nine alternate-key blocks that may exist at the point of reorganization and storing entries in each of the ten alternate-key blocks at a storage rate of 90%.

It may not be possible in some cases, for reasons of alternate-key block and entry size, to strictly implement reservation of suitable storage rates, and since it may be necessary to subject considerably large numbers of alternate-key blocks to reorganization at particular points in time, the requisite exclusion range may expand accordingly and have an adverse effect on system operation.

In order to prevent such circumstances arising, it is preferable in operational terms to use a range of values, such as 85% to 90%, for the suitable initial storage rate. In this case, the storage rate after reorganization need only fall within the range of 85% to 90%.
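The arithmetic behind the nine-block example can be sketched as follows. This is only an illustration under stated assumptions: the function name `blocks_needed`, the use of whole-number percentages and the choice of ceiling division against the upper bound are not taken from the specification.

```python
def blocks_needed(storage_pcts, low_pct=85, high_pct=90):
    """Estimate how many blocks hold the given data at the target storage rate.

    storage_pcts: current storage rate of each block, in percent of one block.
    Returns (block_count, resulting_rate_pct, within_target_range).
    """
    total = sum(storage_pcts)                # total data, in percent-of-a-block units
    count = max(1, -(-total // high_pct))    # ceiling division: never exceed high_pct
    rate = total / count                     # average storage rate after rewriting
    return count, rate, low_pct <= rate <= high_pct


# Nine full blocks are spread over ten blocks at 90% each.
print(blocks_needed([100] * 9))
```

When the data volume does not divide conveniently (a single full block, for example), the third element of the result is False, which corresponds to the case in which the suitable rate cannot be strictly implemented.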

When reserving suitable initial storage rates, entries may be rewritten from their original alternate-key blocks to other alternate-key blocks.

Reorganization of Alternate-Key Tables: Reorganization in Practice and Prevention of Deadlock

In practice, reorganization performs the elimination of alternate-key overflow blocks, the elimination of fragmentation and the reservation of suitable initial storage rates at once, and so is a combination of these three.

In principle, a reorganization system may be implemented as described above, but as alternate-key blocks are sequentially read and alternate-key blocks created with suitable initial storage rates, the risk of deadlock increases since the exclusion range cannot be determined from the outset and exclusion extends sequentially. Effective ways of preventing deadlock are as follows.

The first method of preventing deadlock is to read the alternate-key blocks in sequence without placing them under exclusion, find their record storage rates and calculate the appropriate size for combinations of multiple blocks in order to determine the exclusion range.

Adding Elements to Location Table Entries

The second method of preventing deadlock is to provide alternate-key location table entries capable of maintaining, in addition to block numbers and block addresses, either or both of the low value and high value of the primary key values of records stored in blocks and, additionally, the storage rates of records or the number of bytes occupied by records in blocks. This eliminates the need to read blocks, as described for the first method above, and allows that information to be gained simply by reading the alternate-key location table. However, since the need to rewrite the alternate-key location table then arises with the insertion and deletion of records, a choice should be made between the first and the second method of preventing deadlock described above depending on the state of record generation.

Thus, the reorganization range is determined upon finding the storage rates within alternate-key blocks and alternate-key overflow blocks, and the alternate-key location table entries within that range, the alternate-key blocks that those entries point to and the alternate-key overflow blocks linked to those alternate-key blocks are placed under exclusion.
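One way to picture the determination of the range from location table entries that carry storage rates is sketched below. The entry layout `AltLocationEntry`, the field `used_pct` (taken here to include linked overflow blocks), the stopping rule and the cap `max_entries` are all assumptions made for the illustration; the specification does not prescribe them.

```python
from dataclasses import dataclass


@dataclass
class AltLocationEntry:
    block_no: int
    low_key: str
    used_pct: int  # hypothetical: storage rate of the block plus its overflow blocks


def ceil_div(a, b):
    return -(-a // b)


def choose_range(entries, start, low_pct=85, high_pct=90, max_entries=32):
    """Scan entries from `start` without exclusion and pick one pass's range.

    Returns (end, needed_blocks): entries[start:end] would then be placed under
    exclusion and rewritten into `needed_blocks` blocks.
    """
    total = 0
    end = start
    limit = min(len(entries), start + max_entries)
    while end < limit:
        total += entries[end].used_pct
        end += 1
        # Stop once the data fits a whole number of blocks within the target range.
        if low_pct * ceil_div(total, high_pct) <= total:
            break
    return end, ceil_div(total, high_pct)
```

Because the range is fixed before any exclusion is requested, exclusion does not have to be extended block by block during the pass.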

Then the number of alternate-key blocks and alternate-key overflow blocks subject to reorganization and their storage capacity are found and the actual volume of the entries requiring storage is found, and then it is assessed whether the number of alternate-key blocks required is equal to, greater than or less than the sum of the current number of alternate-key blocks and alternate-key overflow blocks. Then, applying the logic described above for the elimination of alternate-key overflow blocks, the elimination of fragmentation and the reservation of suitable initial storage rates, the alternate-key blocks are reorganized and entries created in the new alternate-key location table.

In this case, the reorganization pointers move at once by the number of blocks subject to reorganization.

Reorganization is thus performed on from one to several tens of alternate-key location table entries and alternate-key blocks in a single pass, but this reorganization does not interfere with regular data processing since, to the system, it appears as a regular data processing transaction.

By performing successive passes of this reorganization on the alternate-key location table and the alternate-key blocks, reorganization of the whole is completed.

Reorganization of Alternate-Key Tables: Completion of Reorganization

A description follows of methods of recognizing the completion of reorganization of an alternate-key location table. An alternate-key final pointer is provided to the current alternate-key location table AALC to indicate the final position used by that alternate-key location table AALC. The final pointer is provided for the following purpose. The alternate-key location table AALC is reserved in a contiguous area. It is possible to provide an additional alternate-key location table in an area discontiguous with the first alternate-key location table when alternate-key location table entries are insufficient and convert addresses to perform binary searches as though their areas were contiguous, but this is not recommended because the load of address conversion increases with large numbers of discontiguous areas. Therefore, a fully adequate area for alternate-key location tables is reserved from the outset and the address of the next entry after the entries used is pointed to in order to distinguish used entries from unused entries.

Thus, the provision of the alternate-key final pointer allows the retrieval of records by means of primary keys even if unused entries exist in the current alternate-key location table AALC by means of executing a binary search between the first address and the alternate-key final pointer in the current alternate-key location table AALC.

Methods of Detecting the Completion of Reorganization

The alternate-key final pointer is used as an indicator. Reorganization of the alternate-key location table AALC and the alternate-key blocks 17 is completed when reorganization runs and the address pointed to by the current reorganization pointer RPAALC matches the address pointed to by the alternate-key final pointer. Alternate-key overflow blocks linked to the alternate-key block 17 that the final entry points to do not represent a problem because the current alternate-key location table AALC does not move until the reorganization of these alternate-key overflow blocks is complete.
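The role of the final pointer, both as the upper bound of the binary search and as the completion indicator, might be sketched as below. The table is modeled as a plain list of low key values with unused trailing entries set to None; that representation and the function names are assumptions for illustration only.

```python
from bisect import bisect_right


def find_entry(low_keys, final_ptr, target):
    """Index of the entry covering `target`, searching only low_keys[0:final_ptr].

    Entries at final_ptr and beyond are unused and are never compared.
    """
    i = bisect_right(low_keys, target, 0, final_ptr) - 1
    return max(i, 0)


def reorganization_complete(rp_current, final_ptr):
    """Reorganization is done when the current pointer reaches the final pointer."""
    return rp_current == final_ptr


table = [10, 20, 30, None, None]   # three used entries, two unused
final_ptr = 3                      # address of the next entry after the used ones
print(find_entry(table, final_ptr, 25))
```

Bounding the search with the final pointer lets a generously sized, contiguous table be searched without address conversion and without touching unused entries.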

Reorganization of Alternate-Key Tables: Reutilization of Unused Blocks

A method of reorganization has thus been described that addresses fragmentation, but in FIG. 12 alternate-key block 4 of the alternate-key blocks 17 and alternate-key block 6 of the alternate-key blocks 17 of the current alternate-key location table AALC are left unused. Left as it is, this may result in failing to eliminate overall fragmentation while eliminating fragmentation within alternate-key blocks. The following approach is adopted in order to prevent this outcome.

Unused Block Allocation Table, Start-Position Pointer and End-Position Pointer

FIG. 13 assists in the description of a method for eliminating overall fragmentation in the database reorganization system that is the third preferred embodiment of the invention as it concerns alternate keys.

An unused alternate-key block allocation table UABAT is used in the preferred embodiment of FIG. 13. The unused alternate-key block allocation table UABAT is a table of the format shown in FIG. 13, and its purpose is to store the addresses of unused alternate-key blocks. Two pointers are also used, a start-position pointer NAAABPS to indicate the start position and an end-position pointer NAAABPE to indicate the end position in the unused alternate-key block allocation table UABAT. FIG. 13 represents a state in which seven unused alternate-key blocks have appeared where none at all previously existed.

In their initial states, both the start-position pointer NAAABPS and the end-position pointer NAAABPE point to the beginning of the unused alternate-key block allocation table UABAT. When an unused alternate-key block (one labeled “unused” in FIG. 13) appears, the address of the unused alternate-key block is registered in the entry that the end-position pointer NAAABPE of the unused alternate-key block allocation table UABAT is pointing to and the end-position pointer NAAABPE is rewritten to point to the next entry in the unused alternate-key block allocation table UABAT. FIG. 13 illustrates the state after the sequential appearance of seven unused alternate-key blocks. Here, the end-position pointer NAAABPE is pointing, as shown in FIG. 13, to the eighth entry in the unused alternate-key block allocation table. FIG. 14 illustrates the reutilization of blocks in the database reorganization system that is an embodiment of the invention as it concerns alternate keys.

A description follows, with reference to FIG. 14, of a method of reutilization. In the preferred embodiment of the invention as it concerns alternate keys, when the need arises to acquire a new alternate-key block or alternate-key overflow block, rather than acquiring the alternate-key block from an unused area, the unused alternate-key block allocation table UABAT is referenced and, if an unused alternate-key block exists, alternate-key blocks registered in the unused alternate-key block allocation table UABAT are prioritized for use. A method of utilizing unused alternate-key blocks is to use the alternate-key block in the entry that the start-position pointer NAAABPS is pointing to. The entries contain the addresses of unused alternate-key blocks, so when an alternate-key block is added, the address is written to the alternate-key location table, and when an alternate-key overflow block is added, the address is written to the pointer of the alternate-key block or alternate-key overflow block that manages that alternate-key block. The content of the start-position pointer NAAABPS is then rewritten to point to the next entry in the unused alternate-key block allocation table.

FIG. 14 illustrates a state immediately following two executions of such rewriting and the utilization of two unused blocks.

The unused alternate-key block allocation table UABAT may be used in cyclical fashion. When an unused alternate-key block appears, the position of the end-position pointer NAAABPE moves towards the end of the unused alternate-key block allocation table UABAT, and when an unused alternate-key block is utilized, the position of the start-position pointer NAAABPS likewise slides towards the end of the unused alternate-key block allocation table UABAT, and so the one table may be used cyclically as long as the end-position pointer NAAABPE does not overtake the start-position pointer NAAABPS. In other words, when the end-position pointer NAAABPE reaches the final position of the unused alternate-key block allocation table UABAT, it is returned to the beginning of the unused alternate-key block allocation table UABAT again and the unused alternate-key block allocation table UABAT may thus be reused.
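The cyclical use of the unused alternate-key block allocation table corresponds to a ring buffer, which may be sketched as follows. The class name, the block addresses and the extra `count` field (used to tell a full table from an empty one, which the two pointers alone cannot do) are assumptions of this sketch rather than features recited in the specification.

```python
class UnusedBlockTable:
    """Ring buffer of unused block addresses (a sketch of UABAT)."""

    def __init__(self, size):
        self.slots = [None] * size
        self.start = 0   # start-position pointer (NAAABPS): next block to reuse
        self.end = 0     # end-position pointer (NAAABPE): next free slot
        self.count = 0   # assumption: counter so the end never overtakes the start

    def register(self, address):
        """Record a block that has become unused."""
        if self.count == len(self.slots):
            raise OverflowError("end-position pointer would overtake start-position pointer")
        self.slots[self.end] = address
        self.end = (self.end + 1) % len(self.slots)   # wrap to the beginning
        self.count += 1

    def acquire(self):
        """Return an unused block address, or None if a fresh block must be allocated."""
        if self.count == 0:
            return None
        address = self.slots[self.start]
        self.start = (self.start + 1) % len(self.slots)
        self.count -= 1
        return address


uabat = UnusedBlockTable(8)
for address in range(100, 107):     # seven unused blocks appear, as in FIG. 13
    uabat.register(address)
first, second = uabat.acquire(), uabat.acquire()   # two are reused, as in FIG. 14
```

After the seven registrations the end-position pointer sits on the eighth entry (index 7), and after the two acquisitions the start-position pointer sits on the third entry (index 2).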

Reorganization of Alternate-Key Tables: Database Access During Reorganization

Next, a description follows, referencing FIG. 12, of data retrieval, reading and writing during reorganization such that it is possible to retrieve, read and write data while performing reorganization.

Retrieval with alternate key values is performed by means of binary search using the current alternate-key location table AALC. The target key value (the key value to be retrieved) is assessed as less than or greater than the reorganization pointer RPAALC. Since entries in the current alternate-key location table AALC are listed in the order of their primary keys, this may be achieved by comparing the key value of the entry that the current reorganization pointer RPAALC is pointing to and the target key value.

If the target key value is less than the low value of the alternate-key key value in the entry that the current reorganization pointer RPAALC is pointing to, then the target entry exists upwards from (in a smaller address than) the current reorganization pointer RPAALC. In this case the new alternate-key location table AALN is used to perform a binary search between the first address in the new alternate-key location table AALN and the new reorganization pointer RPAALN. As the result of the binary search, the records in the alternate-key block that the target entry is pointing to are examined and it is determined whether the target record is present or not.

If the target key value is equal to or greater than the low value of the alternate-key key value of the entry that the current reorganization pointer RPAALC is pointing to, the target entry exists downwards from the current reorganization pointer RPAALC (in the entry that RPAALC is pointing to or at a larger address). In this case the current alternate-key location table AALC is used to perform a binary search on the entries between the current reorganization pointer RPAALC and the final pointer in the current alternate-key location table AALC.

Since the target entry may thus be definitively retrieved, the record holding the target primary key value may be retrieved by searching inside the block that that entry is pointing to.
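The choice between the two tables can be sketched as follows. Both tables are modeled as lists of low key values and the pointers as list indices; these representations and the name `locate` are assumptions of the illustration.

```python
from bisect import bisect_right


def locate(target, cur_lows, new_lows, rp_cur, rp_new, final_ptr):
    """Return (table_name, entry_index) for `target` while reorganization runs.

    cur_lows / new_lows: low key value of each entry in AALC / AALN.
    rp_cur / rp_new: current and new reorganization pointers (indices).
    final_ptr: index just past the last used entry of AALC.
    """
    if rp_cur < final_ptr and target >= cur_lows[rp_cur]:
        # Not yet reorganized: search AALC between RPAALC and the final pointer.
        return "current", bisect_right(cur_lows, target, rp_cur, final_ptr) - 1
    # Already reorganized: search AALN between its first entry and RPAALN.
    return "new", max(bisect_right(new_lows, target, 0, rp_new) - 1, 0)


cur = [10, 20, 30, 40, 50]      # entries 0 and 1 have already been reorganized
new = [10, 18]                  # their contents now live behind two new entries
print(locate(35, cur, new, rp_cur=2, rp_new=2, final_ptr=5))
print(locate(19, cur, new, rp_cur=2, rp_new=2, final_ptr=5))
```

Only one comparison against the entry at the current reorganization pointer is needed before the ordinary binary search proceeds on the chosen table.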

The foregoing description concerns alternate-key tables while their reorganization is underway. Since multiple tables are not reorganized at the same time, reorganization will not be running on a location table while reorganization is running on an alternate-key table. It will therefore not be the case that two location tables, one current and one new, exist, nor will it be necessary to determine which to use, as described for the reorganization of a location table and blocks.

Additionally, an alternate-key block containing the entry that holds the target alternate-key key value cannot be accessed while it is actually under reorganization, by reason of exclusion, and the request is queued for release from exclusion, but this is no different from the update, insertion or deletion of records in normal access. In other words, requests to excluded alternate-key blocks are queued for their release from exclusion and may be processed once reorganization of that alternate-key block is complete and it is released from exclusion.

The description above concerns the retrieval of records but may also be applied to the updating and deletion of records by means of alternate keys. Insertions of records are performed by means of primary keys and so are not pertinent here. Care is required when executing a record deletion because alternate keys are non-unique. However, this may be addressed as described below.

Reorganization of Alternate-Key Tables: Handling Advances in Reorganization During Retrieval Operations

FIG. 15 illustrates operation, in the database reorganization system that is an embodiment of the present invention as it concerns alternate keys, when reorganization advances during a retrieval operation using an alternate key and the position of the current reorganization pointer at the beginning of the retrieval and the position of the current reorganization pointer at the end of the retrieval are different. It has been explained how it is possible to call entries during reorganization by using the reorganization pointer to make comparisons with the target key value and deciding whether to use the current alternate-key location table AALC or the new alternate-key location table AALN.

However, as shown in FIG. 15, reorganization may advance during a retrieval by means of an alternate key using the alternate-key location tables AALC and AALN, so that the position of the current reorganization pointer RPAALC at the end of the retrieval is different from its position at the start of the retrieval. If the alternate-key blocks in that range were subject to the search, there is a possibility that entries that actually exist may appear no longer to exist because alternate-key overflow blocks have already been delinked. In FIG. 15 an overflow block 15 has been delinked from a primary block 5. Left as is, this leads to unstable operation and renders the system unusable.

Retrievals may be performed without problem by implementing the following measures. As summarized in the following chart, the target key value and reorganization pointer are used to determine whether the location table subjected to the search is the current alternate-key location table AALC or the new alternate-key location table AALN. If the alternate-key location table used is the new alternate-key location table AALN, no problem arises even if the new reorganization pointer RPAALN has advanced from when the search started. Problems arise when it is the current alternate-key location table AALC that is used. When the current alternate-key location table AALC is searched and the current reorganization pointer RPAALC does not move, no problem arises. On the other hand, when the current alternate-key location table AALC is searched and the current reorganization pointer RPAALC moves, problems will arise unless measures are implemented.

Access Methods When Reorganization Advances

If reorganization is underway, the value of the current reorganization pointer RPAALC (the address of the next entry in the alternate-key location table to be reorganized) and the value of the new reorganization pointer RPAALN are saved before initiating retrieval. These are S-RPAALC and S-RPAALN. At the point the search in the current alternate-key location table AALC is completed, the value of the current reorganization pointer RPAALC at that point (which is termed “E-RPAALC”) is compared with the value of S-RPAALC. If these values are different, this means that reorganization advanced during the search. In this case, it is determined where the block that is the object of retrieval is. If the determination finds that the block is between S-RPAALC and E-RPAALC, it is possible, as discussed above, that the entry cannot be retrieved.

In this case, the new alternate-key location table AALN is used to perform a binary search between S-RPAALN and E-RPAALN (the value of the new reorganization pointer RPAALN at the point the first search completed). Here, if the entry can be detected, the entry is reckoned to exist, and if it cannot be detected, the entry is reckoned not to exist. This avoids the phenomenon of an entry that does exist being reported as non-existent.
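The save-and-compare procedure can be sketched as below. The `Tables` class, the `during_search` hook that simulates reorganization advancing mid-search, and the helper `advance` are all hypothetical devices for the illustration; a real system would observe the pointers changing concurrently.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class Tables:
    cur_lows: list                      # low keys of AALC entries
    final_ptr: int                      # alternate-key final pointer
    new_lows: list = field(default_factory=list)   # low keys of AALN entries
    rp_cur: int = 0                     # RPAALC
    rp_new: int = 0                     # RPAALN


def advance(t, n_current, new_lows):
    """Simulate one reorganization pass: n_current entries become len(new_lows)."""
    t.new_lows.extend(new_lows)
    t.rp_cur += n_current
    t.rp_new += len(new_lows)


def lookup(t, target, during_search=lambda: None):
    """Search AALC, then re-check in AALN if reorganization overtook the hit.

    Assumes the caller already determined that target >= cur_lows[rp_cur].
    """
    s_cur, s_new = t.rp_cur, t.rp_new                      # S-RPAALC, S-RPAALN
    i = bisect_right(t.cur_lows, target, s_cur, t.final_ptr) - 1
    during_search()                                        # reorganization may advance here
    e_cur, e_new = t.rp_cur, t.rp_new                      # E-RPAALC, E-RPAALN
    if e_cur != s_cur and s_cur <= i < e_cur:
        # The hit lies in the range reorganized meanwhile: search AALN instead.
        j = bisect_right(t.new_lows, target, s_new, e_new) - 1
        return "new", max(j, s_new)
    return "current", i
```

For example, with `t = Tables([10, 20, 30, 40], final_ptr=4)`, calling `lookup(t, 25, lambda: advance(t, 2, [10]))` falls back to the new table because entries 0 and 1 were reorganized during the search.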

It is seen that, as described above, entries may be read even during reorganization.

Reorganization of Alternate-Key Tables: Entry Insertion

The insertion of an entry occasioned by the insertion of a record requires, as discussed above with respect to calling an entry, performing a binary search to determine the alternate-key block into which the entry will be inserted and then inserting the entry into that alternate-key block. The search consists of the same operations as calling an entry.

When an alternate key value in a record is modified, the location in which the alternate-key entry is stored must change. The reason is that entries are stored in alternate-key blocks in the order of their alternate key values and if an entry exists outside that range, it cannot be retrieved with the alternate-key location table. Therefore, when an alternate-key value is modified, the current entry is deleted and a new entry is then inserted in the alternate-key block identified on the basis of the alternate-key value as that where it should be stored. This is a method that has been in general use in conventional databases.

Otherwise, alternate-key blocks may be found and written with the same methods as are used for calling entries.

Suspending and Resuming Reorganization

It is possible, as described above, to retrieve (read), update, insert, add and delete records by means of alternate key values even while the alternate-key location table is under reorganization. In short, it goes without saying that records may be accessed however far reorganization has advanced or, in other words, from any entry in the alternate-key location table.

It follows that record access may be executed without problem even if reorganization is temporarily suspended. Suspension is, of course, effected after reorganization has completed on some given selection of alternate-key location tables and alternate-key blocks.

In the database reorganization system that is a preferred embodiment of the present invention as it concerns alternate keys, reorganization may be resumed with the entry in the current alternate-key location table AALC and the new alternate-key location table AALN indicated by the current reorganization pointer RPAALC and the new reorganization pointer RPAALN at the point reorganization was suspended.

Since these functions permit reorganization to be suspended and resources allocated to data processing when the load on the primary system increases and then resumed when the load of data processing falls, there is no need to make advance forecasts of the load on the primary system and operating conditions and reserve a fixed period of time for reorganization in advance.
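Suspension and resumption driven by system load might be organized along the following lines. The `is_busy` callback, the dictionary holding the two pointers and the simplification that each pass writes as many new entries as it consumes are assumptions of this sketch; the actual work of a pass is omitted.

```python
def reorganize(state, final_ptr, is_busy, entries_per_pass=2):
    """Run reorganization passes until done or until the system becomes busy.

    state: dict with "rp_cur" and "rp_new" (the two reorganization pointers).
    Returns True when reorganization is complete, False when it was suspended.
    """
    while state["rp_cur"] < final_ptr:
        if is_busy():
            return False          # suspend between passes; the pointers keep our place
        step = min(entries_per_pass, final_ptr - state["rp_cur"])
        # ... place `step` entries under exclusion and rewrite their blocks (omitted) ...
        state["rp_cur"] += step
        state["rp_new"] += step   # assumption: one new entry per current entry
    return True


state = {"rp_cur": 0, "rp_new": 0}
load = iter([False, True])                            # idle for one pass, then busy
suspended = not reorganize(state, 6, lambda: next(load))
finished = reorganize(state, 6, lambda: False)        # resumes from the saved pointers
```

Because suspension only ever happens between passes, the saved pointers always describe a consistent boundary between reorganized and unreorganized entries.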

It has been recited that the following three kinds of entries exist in alternate-key tables. The three kinds are one that maintains block numbers, one that maintains block addresses and one that does not maintain either block numbers or addresses. It has also been recited that these have the following characteristics.

Entries that do not maintain either block numbers or addresses allow reduction of the load and time required for reorganization, but increase the load of retrieval because of the need to search location tables after retrieving the alternate-key entry in an alternate-key search.

Although entries that maintain block numbers or block addresses place a large load on reorganization due to the need to modify information when a number or address changes during reorganization, they do result in greater efficiency of retrieval using alternate keys.

Whichever format of alternate-key entry is selected for the creation and use of alternate-key tables, the conditions under which they are used may change from the original intention. Reorganization may be necessary more frequently than expected, for example, or conversely, the frequency of reorganization may fall markedly below that originally planned.

Thus, it is convenient to have the capability of changing the format of entries when reorganizing alternate-key tables and switching between improved speed of reorganization at the expense of less efficient retrieval, addition, updating and deletion on the one hand and enhanced efficiency of retrieval, addition, updating and deletion at the expense of slower reorganization on the other.

This is made possible by changing the format of alternate-key entries when reorganizing alternate-key tables as follows.

When block numbers or addresses are added to alternate-key entries of the format that does not maintain either block numbers or block addresses, the volume of data per alternate-key block increases by the product of the number of alternate-key entries already present in the alternate-key block and the size of a block number or block address.

When performing the elimination of overflow, the elimination of fragmentation and the reservation of suitable initial storage rates on one or multiple alternate-key blocks in one pass, alternate-key blocks are reserved so that the increased amount of data may be written. Block numbers or block addresses are added to the entries in the alternate-key blocks affected and the entry data rewritten.

For each alternate-key entry affected, the location table is searched on the basis of the primary key value of that entry and the block number or block address that is found is appended to the entry in the new alternate-key table.
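Appending block numbers during a pass can be sketched as follows. Entries are modeled as tuples and the location table as parallel lists of low primary key values and block numbers; these shapes, the function names and the 4-byte block number size are assumptions made for the illustration.

```python
from bisect import bisect_right


def add_block_numbers(alt_entries, loc_lows, loc_block_nos):
    """Convert (alt_key, primary_key) entries to (alt_key, primary_key, block_no).

    loc_lows: low primary key value of each location table entry, ascending.
    loc_block_nos: block number held by the corresponding location table entry.
    """
    converted = []
    for alt_key, primary_key in alt_entries:
        i = max(bisect_right(loc_lows, primary_key) - 1, 0)   # location table search
        converted.append((alt_key, primary_key, loc_block_nos[i]))
    return converted


def added_bytes(entry_count, block_no_size=4):
    """Extra space needed in a block: entries times the size of one block number."""
    return entry_count * block_no_size


entries = [("smith", 250), ("tanaka", 120)]
print(add_block_numbers(entries, [100, 200, 300], [0, 1, 2]))
```

The one location table search per entry performed here is the cost that retrieval no longer pays afterwards.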

Conversely, when block numbers or block addresses are deleted, the volume of data per alternate-key block decreases by the product of the number of alternate-key entries already existing in the alternate-key block and the size of a block number or block address.

When performing the elimination of overflow, the elimination of fragmentation and the reservation of suitable initial storage rates on one or multiple alternate-key blocks in one pass, alternate-key blocks are reserved taking into account the decrease in the volume of data. Block numbers or block addresses are deleted from the entries in the alternate-key blocks affected and the entry data rewritten. This takes less time than when appending data because only the deletion of data is involved.

Reorganization Rewriting Blocks

The above description concerns reorganization of alternate-key location tables, alternate-key blocks and alternate-key overflow blocks. This implementation of reorganization has the benefits of holding the rewriting of alternate-key blocks and alternate-key overflow blocks to a minimum and abbreviating reorganization times by rewriting the current alternate-key location table to a new alternate-key location table. However, alternate-key blocks must be rewritten in order to change the size of alternate-key blocks or in order to change the alternate-key block storage medium. Application of the system described above allows ready execution of reorganization while thus rewriting alternate-key blocks.

The details are entirely the same as in the methods described for the present invention as it concerns primary keys. FIG. 19 is an illustration for the purpose of description with respect to location tables, blocks and overflow blocks, but the logic applies in exactly the same fashion here.

As with the database access methods described for the present invention as it concerns primary keys, if the target key value is less than the reorganization pointer, the new database is accessed, and if the target key value is greater than the reorganization pointer, the current database is accessed. The ability to suspend and resume reorganization is likewise the same as in the present invention as it concerns primary keys.

The current and new databases are depicted here as present on the same machine, but they may also be present on different machines.

Exclusion Methods and Exclusion Procedures

The following description concerns exclusion methods and exclusion procedures. “Exclusion procedures” refers to procedures for implementing exclusion, for which are proposed methods that place a low load on the system. “Exclusion methods” refers primarily to the sequence of exclusion. As differing sequences of exclusion can be a cause of deadlock, it is important to ensure a uniform exclusion sequence in a system. The description shows that the exclusion sequences in these procedures and in this system are the same.

Exclusion Procedures

The exclusion procedures are implemented by directly writing an excluded state to each of location tables, blocks (including overflow blocks), alternate-key location tables and alternate-key blocks (including alternate-key overflow blocks).

It is standard practice in information processing that when transactions are generated, they are placed in a queue in the order of their generation and processed in that order. It is also common for access requests within transactions to be expressed in the form of request blocks. The execution of a transaction results in access to various types of data. Even when they constitute access to the same type of data, there are differences between retrieval using location tables and retrieval using alternate-key tables.

Such information as the origin identification of the processing request, the transaction number, the data type, the processing request identification (read, update, addition, insertion, deletion), the read key, the read key value and the write record is stored in the request block either as is or in a form indicating an address where that information is held.

Two fields are added to these request blocks. One is an exclusion table address, and the other is an exclusion table pointer. The exclusion table address refers to a table storing exclusion information. In addition to holding addresses affected by exclusion, the entries in this table hold address identification flags, which are described below. The entries are of a size capable of holding the addresses. The tables have a size (their number of entries) of, for example, 100, and when this becomes insufficient, another 100 are added and the final address in the first table holds the address of the next exclusion table. Address identification flags are provided in order to identify whether an address is one affected by exclusion or that of the next table.
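The chaining of exclusion tables through their final entry, with a flag distinguishing the two kinds of address, may be sketched as follows. The flag values, the class and function names, and the use of Python object references in place of addresses are assumptions of this sketch.

```python
LOCKED, NEXT_TABLE = "locked", "next"   # hypothetical address identification flags


class ExclusionTable:
    def __init__(self, size=100):
        self.size = size
        self.entries = []               # list of (flag, address) pairs


def record_exclusion(first, address):
    """Append an excluded address, chaining a new table when the last one fills."""
    table = first
    while table.entries and table.entries[-1][0] == NEXT_TABLE:
        table = table.entries[-1][1]    # follow the chain to the last table
    if len(table.entries) == table.size - 1:
        nxt = ExclusionTable(table.size)
        table.entries.append((NEXT_TABLE, nxt))   # final entry holds the next table
        table = nxt
    table.entries.append((LOCKED, address))


def excluded_addresses(first):
    """Yield every excluded address across the whole chain, in order."""
    table = first
    while table is not None:
        nxt = None
        for flag, value in table.entries:
            if flag == LOCKED:
                yield value
            else:
                nxt = value
        table = nxt
```

With `size=3`, for instance, recording five addresses fills two tables (two addresses plus a link each) and places the fifth in a third table.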

When an exclusive access request occurs in a request block, the addresses affected by exclusion are placed in the exclusion table. In the case of a location table, this is the address of the entry affected, in the case of a block, the address of the block affected, in the case of an alternate-key location table, the address of the entry affected, and in the case of an alternate-key block or an alternate-key overflow block the address of the block affected.

FIG. 16 illustrates the exclusion of a location table in the database reorganization systems that are the first and third preferred embodiments of the present invention. FIG. 16 depicts a request queue 110, request blocks 120 and 121, exclusion tables 130-0, 130-1 and 131-0, and a location table LC.

Entry 0 in the exclusion table 130-0 points to entry 0 in the location table LC. This does not mean that entry 0 in an exclusion table 130-0 must always point to entry 0 in the location table LC, but merely represents their relationship of correspondence. A field is added so that the location table LC is able to hold the addresses of entries in the exclusion tables 130-0, 130-1 and 131-0. The location table LC entry 0 points to entry 0 in the exclusion table 130-0.

Whether that entry in the location table LC is placed under exclusion may be identified by whether an address is in the exclusion-table entry address field of that entry. If there is an address in that field, that entry is placed under exclusion and so may not be accessed from other requests.

Location table LC entry addresses are held in the exclusion tables 130-0, 130-1 and 131-0 both in order to ensure the lifting of exclusion when a transaction has completed and in order to quickly effect the lifting of exclusion should the system suffer abnormality and cease operation entirely.

While entries in the location table LC contain identification of their exclusion status, without the exclusion tables 130-0, 130-1 and 131-0, it would be necessary, after an operational stoppage, to look through all the entries in the location table LC and lift exclusion on those placed under exclusion. On the other hand, where entries in the exclusion tables 130-0, 130-1 or 131-0 contain addresses in the location table LC, it is sufficient to look at the exclusion table 130-0, 130-1 or 131-0 and lift exclusion on the relevant entries in the location table LC.
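The two-way reference between location table entries and exclusion table entries can be sketched as below. Location table entries are modeled as dictionaries with a hypothetical `excl` field and each request's exclusion table as a plain list of entry indices; none of these names come from the specification.

```python
def try_exclude(location_table, index, exclusion_table):
    """Place one location table entry under exclusion for a request.

    Returns False if another request already holds it (the caller would queue).
    """
    entry = location_table[index]
    if entry["excl"] is not None:
        return False
    exclusion_table.append(index)                        # exclusion table -> entry
    entry["excl"] = (id(exclusion_table), len(exclusion_table) - 1)   # entry -> table slot
    return True


def lift_all(location_table, exclusion_table):
    """Lift every exclusion held by one request by walking only its exclusion table.

    The same walk serves at transaction end and after an abnormal stoppage,
    so the whole location table never has to be scanned.
    """
    for index in exclusion_table:
        location_table[index]["excl"] = None
    exclusion_table.clear()


lc = [{"excl": None} for _ in range(5)]
request_a, request_b = [], []
try_exclude(lc, 0, request_a)
blocked = not try_exclude(lc, 0, request_b)   # entry 0 is already excluded
lift_all(lc, request_a)
```

The cost of lifting exclusion is proportional to the number of entries a request excluded, not to the size of the location table.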

An exclusion table pointer is used to indicate through which entry the exclusion tables 130-0, 130-1 and 131-0 are used. FIG. 16 depicts a state in which an additional exclusion table 130-1 has been reserved and is used midway through. The location of the additional exclusion table 130-1 is pointed to from the first exclusion table 130-0.

Exclusion Methods

The sequence of exclusion is intimately related to the occurrence of deadlock. If the sequence of exclusion varies with the type of access, the probability of deadlock increases and so exclusion must be effected in an identical sequence regardless of the type of access. Since retrieval-type access does not require exclusion, it is here an exception. The discussion following concerns update-type (addition, insertion, update and deletion) access.

Exclusion is effected in the sequence of location tables, then blocks, (then alternate-key location tables) and then alternate-key blocks. Alternate-key location tables are placed in parentheses because some methods use them and some do not, but where alternate-key location tables are used, exclusion must be performed on alternate-key location tables. An object of exclusion is given here as, for example, location tables, but it is only the entries in a location table, not the entire location table, that are placed under exclusion.
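A uniform exclusion sequence can be enforced by sorting the objects a request needs before any exclusion is requested, as in the following sketch. The kind names and the secondary ordering by address within one kind are assumptions of the illustration; the specification fixes only the order of the kinds.

```python
# Fixed sequence: location tables, blocks, (alternate-key location tables),
# alternate-key blocks. Kind names are hypothetical labels.
EXCLUSION_ORDER = {
    "location_entry": 0,
    "block": 1,
    "alt_location_entry": 2,
    "alt_block": 3,
}


def exclusion_sequence(targets):
    """Order (kind, address) pairs so every request excludes in the same sequence."""
    return sorted(targets, key=lambda t: (EXCLUSION_ORDER[t[0]], t[1]))


wanted = [("alt_block", 17), ("block", 5), ("location_entry", 2), ("alt_location_entry", 9)]
print(exclusion_sequence(wanted))
```

Two requests that need overlapping objects then always contend for the earliest shared object first, so neither can hold a later object while waiting for an earlier one.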

When retrieving a target entry by means of binary search performed on a location table, alternate-key tables or alternate-key location tables, entries and alternate-key blocks placed under exclusion may be at the middle point of binary search, but their exclusion is ignored and the binary search continues.

Exclusion Methods for Access by Primary Key

Access by primary key consists first of performing a binary search on alocation table and retrieving an entry in the location table. Once theentry is retrieved, the block that entry points to is accessed. Sinceaccess is executed in this sequence, exclusion is effected in thesequence of the location table and then blocks. Exclusion is effected onthe location table because entries in the location table may be updated.

If an overflow block is linked to a primary block, that overflow block is simultaneously placed under exclusion. This is because when a record is added or inserted, records may be moved and overflow blocks accessed, and also because updating may affect record length and likewise result in access to overflow blocks. When an alternate key is modified due to a record update, the entries in alternate-key tables concerned with that alternate key are updated. Where alternate-key location tables are not used, this results in exclusion effected on the pertinent alternate-key blocks of the alternate-key tables and on alternate-key overflow blocks linked to those alternate-key blocks, and where alternate-key location tables are used, first exclusion is effected on alternate-key location tables and then on alternate-key blocks and alternate-key overflow blocks linked to those alternate-key blocks.

This is an exclusion sequence of location tables, then blocks, (then alternate-key location tables) and then alternate-key blocks.
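The fixed exclusion sequence may be illustrated by the following minimal sketch, which is not part of the disclosed embodiment. The names `Resource`, `ORDER`, `exclusion_order` and `with_exclusion` are hypothetical, and in-process locks merely stand in for entries in the exclusion tables. The sketch shows only that every update-type access acquires exclusion in one and the same sequence, whatever order the objects were named in.

```python
import threading
from contextlib import ExitStack

# Hypothetical ranking of exclusion objects, following the sequence in the text:
# location table entries, blocks, (alternate-key location table entries),
# alternate-key blocks.
ORDER = {
    "location_entry": 0,
    "block": 1,
    "altkey_location_entry": 2,
    "altkey_block": 3,
}

class Resource:
    """An object of exclusion; the lock stands in for an exclusion table entry."""
    def __init__(self, kind, ident):
        self.kind = kind
        self.ident = ident
        self.lock = threading.Lock()

def exclusion_order(resources):
    """Sort resources into the fixed sequence, breaking ties by identifier."""
    return sorted(resources, key=lambda r: (ORDER[r.kind], r.ident))

def with_exclusion(resources, action):
    """Acquire every lock in the fixed sequence, run the action, then release."""
    with ExitStack() as stack:
        for res in exclusion_order(resources):
            stack.enter_context(res.lock)
        return action()
```

Because two concurrent accesses always request a location table entry before a block, neither can hold a block while waiting for the entry held by the other.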

Exclusion Methods for Alternate Keys

The following applies where access is by alternate key.

Where alternate-key location tables are not used, first a binary search is performed on alternate-key tables and the alternate-key block containing the entry holding the target alternate-key value is retrieved. Where alternate-key location tables are used, a binary search is performed on alternate-key location tables, the entry containing the target key value retrieved, and the alternate-key block entry of the alternate-key block pointed to by that entry retrieved. Thus far, retrieval may be executed without effecting exclusion. No problems at all arise from not effecting exclusion.

Once the target alternate-key block entry is found, the information in that entry is then used to perform a search on the location table. As recited above, this search of the location table varies with the format of the alternate-key block entry.

Once the location table entry that contains the target primary key is retrieved, that location table entry is placed under exclusion. The block and overflow blocks pointed to by that location table entry are then placed under exclusion, and the record retrieved from those blocks. If the record is found and updated, and this results in modification of the alternate-key value, the same action is taken as when performing a search by primary key.

Effecting exclusion in this fashion, the exclusion sequence is one of location tables, then blocks, (then alternate-key location tables) and then alternate-key blocks.

As recited with respect to alternate-key tables, where block addresses are maintained in alternate-key entries, when an alternate-key block entry is found, the block is accessed without passing through the location table because the block address is maintained in that entry. Therefore, the sequence of exclusion reverses (blocks, then the location table). Accordingly, the method of maintaining block addresses in alternate-key location table entries is inadvisable from the viewpoint of deadlock.

Coordination with the Data Backup and Recovery System

An application of the reorganization of the present invention to the data backup and recovery system (Japanese Patent 2001-094678) proposed by the inventors is described with reference to FIG. 17 and FIG. 18.

FIG. 17 is a flowchart illustrating operation in a synchronous tightly-coupled data backup and recovery system that is employed in the invention as it concerns either primary keys or alternate keys.

FIG. 18 is a flowchart illustrating operation in an asynchronous loosely-coupled data backup and recovery system that is employed in the invention as it concerns either primary keys or alternate keys.

As shown in FIG. 17 and FIG. 18, this data backup and recovery system is comprised of a primary system 1 that performs the retrieval and updating of data and a secondary system 2 that makes backups of that data. Also as shown in FIG. 17 and FIG. 18, the primary system 1 is provided with a backup control mechanism 104. Additionally, as shown in FIG. 1, FIG. 2 and FIG. 9, the secondary system 2 is provided with a location table, alternate-key tables and alternate-key location tables and, employing the same reference numerals stated above, these are discussed with system numbers appended. The system number of the primary system is 1, and the system number of the secondary system is 2.

In the system illustrated in FIG. 17, each time data is modified in the primary system 1, a notification of modification is made to the secondary system 2 and the data is modified in the secondary system 2.

To describe the logs employed in this system, A logs contain information describing modifications to data, and B logs are maintained in order to restore the data to its original state when a transaction is canceled. Another purpose of B logs is to restore the data to some given past state. T logs are logs of transaction data. When data is incorrectly updated due to programming error, B logs may be used to restore the data to its state immediately prior to the program abnormality, and after the program is replaced with a properly functioning one, T logs may be used to normalize the data. It is further possible to provide more than one secondary system 2, as needed.

Two implementations of the data backup and recovery system described above are recited: a synchronous tightly-coupled system in which data is updated to synchronize with the primary system 1 when A logs are transmitted to the secondary system 2, and an asynchronous loosely-coupled system in which data updates are delayed in the unsynchronized secondary system 2.

Key Issues in Adoption of Synchronous Tightly-Coupled Systems and Asynchronous Loosely-Coupled Systems

A description follows of the key issues involved in the adoption of synchronous tightly-coupled systems and asynchronous loosely-coupled systems. As is well-known, the speed of light is approximately 300,000 km/sec. Calculating data-transmission time from the speed of light, we find as follows. The time required for transmission over a distance of 300 meters is 1 microsecond, for transmission over a distance of 30 meters 100 nanoseconds and for transmission over a distance of three meters 10 nanoseconds.

The following calculations also hold true for data transmission speeds. On equipment with a data transmission speed of one gigabit/sec, taking 1 gigabit/sec as approximately 100 megabytes/sec, the time required to transmit one kilobyte of data is 10 microseconds.

A synchronous tightly-coupled implementation incurs both a transmission delay proportional to distance and a transmission time proportional to data volume. Therefore, adoption of a synchronous tightly-coupled implementation must be premised on these time requirements.
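The two time requirements may be reproduced with the following short calculation, offered only as an illustration. The function names are hypothetical, one kilobyte is taken as 1,000 bytes, and the rate of 100 megabytes/sec is the approximation used above for a one gigabit/sec link.

```python
SPEED_OF_LIGHT_M_PER_S = 300_000_000  # approximately 300,000 km/sec, as in the text

def propagation_delay_s(distance_m):
    """Delay proportional to distance (one way, at the speed of light)."""
    return distance_m / SPEED_OF_LIGHT_M_PER_S

def transfer_time_s(num_bytes, bytes_per_s=100_000_000):
    """Time proportional to data volume at the assumed transmission rate."""
    return num_bytes / bytes_per_s

def sync_overhead_s(distance_m, num_bytes, bytes_per_s=100_000_000):
    """One-way cost of sending one log: propagation delay plus transfer time."""
    return propagation_delay_s(distance_m) + transfer_time_s(num_bytes, bytes_per_s)

if __name__ == "__main__":
    for meters in (300, 30, 3):
        print(meters, "m:", propagation_delay_s(meters), "s")
    print("1 kilobyte:", transfer_time_s(1000), "s")
```

For a one-kilobyte log sent over 300 meters, the transfer time of about 10 microseconds dominates the propagation delay of about 1 microsecond; over long distances the relationship reverses.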

Reorganization as Transaction

As recited above, the reorganization of the present invention is performed on from one to several tens of blocks, alternate-key blocks or the like in any given reorganization pass, and the reorganization is treated such that it appears to be a regular data processing transaction.

In the data backup and recovery system (Japanese Patent 2001-094678, domestic priority claimed), backups are performed in transaction units and the scope of recovery is also determined according to whether transactions have completed or not.

The reorganization of the present invention is performed as a single transaction so as not to interfere with the data backup and recovery system recited above.

A Log Reorganization System Advantageous in Synchronous Tightly-Coupled Systems

The description concerns itself first with an A log reorganization system that is beneficial in a synchronous tightly-coupled implementation. The reorganization of the present invention is performed by reorganizing the primary system 1 and secondary systems 2 simultaneously. When executing reorganization on the location table entry 123 on the primary system 1, for example, reorganization of the secondary location table entry 123 is performed on all secondary systems 2. The secondary systems 2 in this case use the A log transmitted from the primary system 1 to update the data on secondary systems 2.

The description first concerns itself with the reorganization of location tables and blocks in an application to the data backup and recovery system of the database reorganization system that is a preferred embodiment of the present invention as it concerns primary keys.

As the reorganization system has already been described in detail and is implemented likewise on secondary systems 2, the description following emphasizes coordination of the primary system 1 and secondary systems 2. While this description of the present invention is for a single secondary system 2, it is implemented in like fashion on multiple secondary systems 2. The system number of secondary systems is 2.

Next, FIG. 3 and FIG. 17, used in the foregoing description, are used below to describe the operation of the primary system and the secondary system.

In FIG. 3 and FIG. 17, in order to perform reorganization, a log from the primary system 1 is required notifying the secondary system 2 to that effect. This is termed “RS information”. This is information to start reorganization and, in addition to information on which key (location table or alternate-key tables) to effect reorganization on, it also includes such information as suitable initial storage rates. In this case, the object of reorganization is a location table.

In FIG. 17 processing of reorganization transaction 1 begins (S301 in FIG. 17), and first a transmission of RS information is made (S302 in FIG. 17) from the primary system 1 to the secondary system 2.

After receiving (S401 in FIG. 17) the RS information, the secondary system 2 creates reorganization pointers RPLC2 and RPLN2, and a region is reserved (S402 in FIG. 17) for a new location table LN2. The initial value of the reorganization pointer RPLC2 is the first address in the current location table LC2, and the initial value of the reorganization pointer RPLN2 is the first address in the new location table LN2. When this has been done, the secondary system 2 transmits (S403 in FIG. 17) RS-ACK2 information to the primary system 1.

After receiving (S303 in FIG. 17) the RS-ACK2 information, the primary system 1 begins actual reorganization work.

The above series of operations constitutes a transaction on the primary system 1 and the secondary system 2. Although this is reorganization transaction 1 and may be distinguished from other transactions in FIG. 17, there is in fact no need to classify types of transactions and it may be treated in like fashion as a regular transaction.
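The start-up exchange (S301 through S303 and S401 through S403) may be sketched as follows. This is an illustration under stated assumptions rather than the disclosed implementation: the class `SecondarySystem`, the function `start_reorganization`, the dictionary message layout and the use of list indices as stand-ins for table addresses are all hypothetical.

```python
class SecondarySystem:
    """Hypothetical stand-in for secondary system 2."""
    def __init__(self, current_table):
        self.lc = current_table   # current location table LC2
        self.ln = None            # new location table LN2, not yet reserved
        self.rplc = None          # reorganization pointer RPLC2
        self.rpln = None          # reorganization pointer RPLN2
        self.storage_rate = None

    def on_rs(self, rs):
        """Handle RS information: reserve LN2 and create both pointers."""
        self.ln = []              # region reserved for the new location table
        self.rplc = 0             # first position in LC2 (index used as address)
        self.rpln = 0             # first position in LN2
        self.storage_rate = rs["initial_storage_rate"]
        return {"type": "RS-ACK2"}

def start_reorganization(secondary, target, initial_storage_rate):
    """Primary side: send RS information and proceed only after the ACK."""
    rs = {"type": "RS", "target": target,
          "initial_storage_rate": initial_storage_rate}
    ack = secondary.on_rs(rs)
    return ack["type"] == "RS-ACK2"
```

In this sketch the call and its return value take the place of the transmission path; in a real deployment the RS information and RS-ACK2 information would travel as logs between the two systems.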

In FIG. 17, next the processing of a regular transaction 2 is executed (S304 in FIG. 17) on the primary system 1. Next, a reorganization transaction 3 is executed (S305 in FIG. 17) on the primary system 1. On primary system 1, as shown in FIG. 3, location tables LC1 and LN1 and, as necessary, blocks 10 are referenced and the first reorganization range determined. For example, assume block number 0 of the blocks 10, block number 1 of the blocks 10, block number 1-2 of its overflow blocks 13 and block number 1-3 of its overflow blocks 14 are objects of the reorganization. In this case, notification is given (S306 in FIG. 17) from the primary system 1 to the secondary system 2 in RSES information to exclude block numbers 0 and 1 of the blocks 10 and block numbers 1-2 and 1-3 of the overflow blocks 13 and 14 and to execute reorganization with these blocks as the objects of the first reorganization, and on primary system 1 exclusion of block numbers 0 and 1 of the blocks 10 and block numbers 1-2 and 1-3 of the overflow blocks 13 and 14 is executed.

Once the RSES is received (S404 in FIG. 17), except in special cases where reorganization cannot be executed, exclusion of block numbers 0 and 1 of the blocks and block numbers 1-2 and 1-3 of the overflow blocks is executed (S405 in FIG. 17) immediately on the secondary system 2. The secondary system 2 then transmits (S406 in FIG. 17) to the primary system 1 RSES-ACK2 information, which is notification that the exclusion is done.

Reorganization of block numbers 0, 1, 1-2 and 1-3 of the blocks 10 is then executed on the primary system 1. As a result, changes occur in location table LN entries and in block numbers 0 and 1 of the blocks 10 and block numbers 1-2 and 1-3 of the overflow blocks 13 and 14, and these changes are transmitted (S307 through S311 in FIG. 17) to the secondary system 2 in the form of an A log. In logical terms, this is the same as an A log involving regular record updates, and so once the A log has been received, that A log is applied (S407 through S410 in FIG. 17) to the corresponding entries and blocks on the secondary system 2.

Application of the A log is as follows. The A log gives notification of block numbers to identify which entries in the location table and which parts of blocks are at issue. Where an overflow block is linked to a primary block, one method that may be employed for the notification given of block numbers if operations are to be performed on these collectively as one object is to transmit as the A log post-update information of the whole of the primary block and any overflow blocks linked to it, but since this involves the transmission of large volumes of data, the volume of data transmitted may be reduced by transmitting it with identification numbers indicating which overflow block number the log concerns. Another way is, reckoning the primary block and any overflow blocks linked to it as a whole, to extract only the portion updated and transmit it in a format providing offset, length and post-update data.
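The last format mentioned above (offset, length and post-update data, with the primary block and its overflow blocks reckoned as one whole) may be illustrated as follows. The function `apply_patch` and the representation of blocks as byte arrays are assumptions made for this sketch; the length is carried implicitly by the length of the data.

```python
def apply_patch(chain, offset, data):
    """Write post-update bytes into a primary block and its overflow blocks.

    chain  -- list of bytearray: the primary block followed by linked overflow blocks
    offset -- byte position counted from the start of the primary block
    data   -- post-update bytes; may span a block boundary
    """
    total = sum(len(block) for block in chain)
    if offset < 0 or offset + len(data) > total:
        raise ValueError("patch lies outside the block chain")
    cursor = offset          # absolute position still to be written
    rest = bytes(data)       # bytes still to be written
    base = 0                 # absolute position where the current block starts
    for block in chain:
        local = cursor - base
        if rest and 0 <= local < len(block):
            n = min(len(rest), len(block) - local)
            block[local:local + n] = rest[:n]   # same-length slice assignment
            rest = rest[n:]
            cursor += n
        base += len(block)
```

Only the modified portion crosses the transmission path; the secondary system locates the affected bytes from the offset alone.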

Logic Transmission

The following implementation is also advantageous. Reorganization entails changes to blocks as a whole, but the content of records does not change. In the elimination of fragmentation and the reservation of suitable initial storage rates in particular, there is much movement of records within blocks and between blocks, and in such cases the volume of data transmitted may be greatly reduced by transmitting the movement logic itself as an L (logic) log. An example of such logic would be the 1500-byte rightward movement of all the records within an overflow block and then the movement to the overflow block of the 91st and subsequent records within a primary block.

In the secondary system 2, that logic is applied to manipulate (S405 in FIG. 17) those blocks on the basis of this L log.

To consider the volume of a transmission, assuming blocks of a size of 16 kilobytes and given one hundred 150-byte records internal to a primary block 1 and eighty likewise 150-byte records in an overflow block, transmission as is of post-update information in an A log would require the transmission of block information in addition to the transmission of 150×90=13,500 bytes.

On the other hand, a transmission of the logic involved would amount to a transmission volume of 1,000 bytes or less. Further, a logic format that may readily be executed by an interpreter is preferable, since it is troublesome to execute what is required by a compiler.
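A small interpreter for such a logic log might look as follows. This is a sketch only: the operation name `move_tail_to_front`, the JSON encoding and the modeling of a block as a list of records are assumptions, and under that list model the 1500-byte rightward shift inside the overflow block is implied by inserting ten 150-byte records at its front.

```python
import json

def apply_l_log(blocks, l_log):
    """Interpret a list of movement operations against blocks held as record lists."""
    for op in l_log:
        if op["op"] == "move_tail_to_front":
            src = blocks[op["src"]]
            dst = blocks[op["dst"]]
            moved = src[op["from_index"]:]   # e.g. the 91st and subsequent records
            del src[op["from_index"]:]
            dst[:0] = moved                  # existing records shift rightward
        else:
            raise ValueError("unknown L log operation: " + str(op["op"]))

# Example from the text: 100 records in the primary block, 80 in the overflow block.
blocks = {
    "1": ["rec%03d" % i for i in range(100)],
    "1-2": ["rec%03d" % i for i in range(100, 180)],
}
l_log = [{"op": "move_tail_to_front", "src": "1", "dst": "1-2", "from_index": 90}]
apply_l_log(blocks, l_log)
encoded = json.dumps(l_log)   # what would actually be transmitted
```

After the operation the overflow block holds 90 records, which at 150 bytes each corresponds to the 13,500 bytes an A log would have carried, whereas the encoded logic occupies well under 1,000 bytes.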

Additionally, while it goes without saying, this logic consists solely of that required to apply this portion of the reorganization at this point in time and, that application having been completed, may be discarded.

Application of Logic Transmissions to the Data Backup and Recovery System

This logic transmission may be applied to the data backup and recovery system. The data backup and recovery system employs a method of transmitting to secondary systems either the content of modified blocks itself or those portions that are modified, but when a record is inserted, multiple records are moved within a block. Since the records themselves are not modified, an L log is transmitted and the records in those blocks are manipulated in the secondary system on the basis of the logic transmitted.

As recited with respect to the reorganization of location tables and blocks, when blocks are reconfigured, alternate-key tables may, depending on the format of alternate-key table entries, be updated and so these are likewise transmitted as A logs.

When performing reorganization, information transmitted as A logs must be identified as for the current location table LC1 or the new location table LN1. This is because not doing so would result in mistaken updates. In order to prevent this, A logs contain identification of whether they apply to the current location table LC2 or the new location table LN2.

Mistaken updates may thus be prevented. An A log must also include reorganization pointers. Reorganization pointers are important in understanding how far reorganization has progressed, but this is also because, where a secondary system 2 is used as a reference system, it will not operate without reorganization pointers.
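One way to picture an A log that carries both the table identification and the reorganization pointers is sketched below. The record layout `ALog`, the class `SecondaryTables` and the `(lowest_key, block_address)` entry format are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

@dataclass
class ALog:
    table: str     # "LC" (current location table) or "LN" (new location table)
    entry_no: int  # entry number within that table
    entry: tuple   # hypothetical entry content: (lowest_key, block_address)
    rplc: int      # reorganization pointer into LC at the time of the update
    rpln: int      # reorganization pointer into LN at the time of the update

class SecondaryTables:
    """Location tables and pointers as held on a secondary system."""
    def __init__(self):
        self.tables = {"LC": {}, "LN": {}}
        self.rplc = 0
        self.rpln = 0

    def apply(self, log):
        """Update only the table the log names, then adopt its pointers."""
        if log.table not in self.tables:
            raise ValueError("A log does not identify LC or LN")
        self.tables[log.table][log.entry_no] = log.entry
        self.rplc, self.rpln = log.rplc, log.rpln
```

Because the pointers arrive with every A log, a secondary system used as a reference system can always decide which table a given key belongs to.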

Block numbers 0, 1, 1-2 and 1-3 of the blocks 10 are reconfigured (S304 in FIG. 17) on the primary system 1, and once the blocks are done, the modified content of the blocks and the location table is transmitted (S308 through S311 in FIG. 17) to the secondary system as an A log. When the secondary system 2 receives (S407 thru S3419 in FIG. 17) the A log, it rewrites (S405 in FIG. 17) the new location table LN2 and the blocks in accordance with that information.

When reorganization of block numbers 0 and 1 of the blocks 10 and block numbers 1-2 and 1-3 of the overflow blocks 13 and 14 is done on the primary system 1, exclusion is lifted on those blocks and RSEE information is transmitted (S312 in FIG. 17) to the secondary system. After the RSEE information is received (S411 in FIG. 17) and block numbers 0 and 1 of the blocks and block numbers 1-2 and 1-3 of the overflow blocks have been processed, exclusion is lifted on those blocks in the secondary system 2 and RSEE-ACK2 information is transmitted (S412 in FIG. 17) to the primary system 1.

Thus, as reorganization proceeds on the primary system 1, it is possible to perform reorganization simultaneously on the secondary system 2, synchronizing the primary system 1 and the secondary system 2, by transmitting A logs to the secondary system 2 and immediately updating those blocks or other entities on the secondary system 2.

Reorganization is executed sequentially and, once reorganization is done through the entry immediately prior to that pointed to by the final pointer of the current location table LC, the reorganization of the location table and blocks as a whole is done. Next, the description concerns itself with the application of the second and third preferred embodiments of the present invention to the data backup and recovery system.

First, the description concerns itself, with reference to FIG. 18, with the reorganization of alternate-key tables. As recited for the description of the reorganization of location tables and blocks, in the reorganization of alternate-key tables, the alternate-key blocks and alternate-key overflow blocks affected by reorganization in a pass are determined, and that information is transmitted from the primary system 1 to the secondary system 2. Then the data updated in reorganization is transmitted as an A log from the primary system 1 to the secondary system 2, and on the basis of that information the pertinent alternate-key blocks and, where alternate-key location tables are used, alternate-key location tables are updated on the secondary system 2.

The volume of transmissions may be reduced in the case of alternate-key table reorganization by transmitting L logs instead of A logs.

Advantageous Employment of Parallel Reorganization in the Asynchronous Loosely-Coupled System

Next, the description addresses asynchronous loosely-coupled systems. The method of updates by means of A logs may be implemented with asynchronous loosely-coupled systems as well, but there is a high probability that the update of the secondary system may be delayed due to actual delay resulting from the distance along the path of transmission or to delay in transmission of A logs over the path of transmission resulting from their volume.

The description of the benefits of parallel reorganization to an asynchronous loosely-coupled system makes reference to FIG. 18. As shown in FIG. 18, the primary system 1 and the secondary system 2 perform reorganization simultaneously in order to implement the reorganization of the present invention. Given the execution of reorganization on entry 123 in the location table in the primary system 1, for example, reorganization would be effected on the entry 123 in the secondary location tables in all secondary systems 2. In other words, while the primary system 1 and the secondary system 2 are discrete systems, they perform exactly the same operations. Where there are multiple secondary systems 2, all those secondary systems 2 execute reorganization at the same time as the primary system 1.

In order to perform reorganization, a log is required from the primary system 1 notifying the secondary system 2 to that effect. This is RS information. This RS information is information for starting reorganization and includes such information as suitable initial storage rates in addition to information on which keys (location table or alternate-key tables) are the object of reorganization. In this case reorganization is performed on the location table. The primary system 1 transmits (S501 in FIG. 18) this RS information to the secondary system 2.

After receiving (S601 in FIG. 18) the RS information, the secondary system 2 creates the reorganization pointers RPLC2 and RPLN2 and reserves (S602 in FIG. 18) space for the new location table LN2. Once this is done on the secondary system 2, RS-ACK2 information is transmitted (S603 in FIG. 18) to the primary system 1.

After receiving (S502 in FIG. 18) the RS-ACK2 information, the primary system 1 begins to perform the actual reorganization operations.

The primary system 1 references the location table and, as required, blocks to determine the first range of reorganization. Take the example of block numbers 0 and 1 of the blocks and block numbers 1-2 and 1-3 of the overflow blocks on the primary system 1. Notification is made (S503 in FIG. 18) from the primary system 1 to the secondary system 2 of RSES information in an RSES log to place block numbers 0 and 1 of the blocks and block numbers 1-2 and 1-3 of the overflow blocks under exclusion and to execute reorganization with those blocks as the object of the first reorganization.

Once the RSES is received (S604 in FIG. 18), except in special cases where reorganization cannot be executed, the secondary system 2 transmits (S605 in FIG. 18) RSES-ACK2 to the primary system. Then, the secondary system 2 executes (S606 in FIG. 18) the elimination of overflow blocks, the elimination of fragmentation and the reservation of suitable initial storage rates with the same logic as the primary system 1. And when the primary system 1 has lifted exclusion, it transmits (S504 in FIG. 18) RSEE information to the secondary system 2 as notification that exclusion is lifted.

On the secondary system 2 the block numbers 0 and 1 of the blocks and the block numbers 1-2 and 1-3 of the overflow blocks linked to block number 1 are reorganized (S606 in FIG. 18), and when that is done, RSEE-ACK2 information is transmitted (S607 in FIG. 18) to the primary system 1 as notification that exclusion has been lifted.

Since the primary system 1 and the secondary system 2 are not synchronized, an instruction for reorganization of the next block may be transmitted to the secondary system even if the reorganization of block numbers 0 and 1 of the blocks and block numbers 1-2 and 1-3 of the overflow blocks linked to block number 1 is not done on the secondary system 2.
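The decoupling described above may be pictured as a queue of pending reorganization instructions on the secondary system. The sketch below is an assumption-laden illustration: the class `AsyncSecondary`, the `pass_no` field and the tuple-shaped acknowledgements are hypothetical, and the actual reorganization work is reduced to recording which blocks were handled.

```python
from collections import deque

class AsyncSecondary:
    """Secondary system that accepts RSES instructions faster than it works."""
    def __init__(self):
        self.pending = deque()   # reorganization ranges received but not yet done
        self.done = []           # ranges already reorganized, in order

    def receive(self, rses):
        """Accept an RSES log immediately and acknowledge it."""
        self.pending.append(rses)
        return ("RSES-ACK2", rses["pass_no"])

    def work(self):
        """Reorganize the oldest pending range, then report completion."""
        if not self.pending:
            return None
        rses = self.pending.popleft()
        self.done.append(rses["blocks"])   # placeholder for the real reorganization
        return ("RSEE-ACK2", rses["pass_no"])
```

The primary system may thus issue the instruction for a second range while the first is still queued; the secondary system still processes the ranges in the order received.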

Next, the description addresses the reorganization of alternate-key tables. In the reorganization of alternate-key tables, as recited for the reorganization of location tables and blocks, the alternate-key blocks and alternate-key overflow blocks that are the objects of one pass of reorganization are determined, and that information is transmitted from the primary system to the secondary system. Reorganization is then performed on the secondary system, and when reorganization of the alternate-key blocks that are objects of the pass of reorganization is done, RSEE information is transmitted from the secondary system to the primary system.

Recovery Procedure

Recovery should be executed from a synchronous tightly-coupled secondary system. The reason is that recovery may take a long time in an asynchronous loosely-coupled system because its components are physically distant and its transmission path lacks sufficient capacity. When a recovery request occurs, first a check is made whether the current transaction has completed on the primary system. If it has completed on the primary system, that backup transaction is allowed to complete on the secondary system. If the transaction is uncompleted and has been canceled on the primary system, the B log concerning the backup transaction is used to restore the data on the secondary system to its state prior to execution of the transaction. Then, some or all of the location table entries, blocks (primary blocks and overflow blocks), alternate-key location table entries, alternate-key blocks or alternate-key overflow blocks requested by the primary system are transmitted from the secondary system to the primary system.
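The first step of the recovery procedure, settling the in-flight backup transaction on the secondary system, may be sketched as follows. The function `settle_backup_transaction`, the key-value model of the data and the `(key, image)` log format are assumptions made for illustration; a before-image of `None` is taken to mean that the record did not exist before the transaction.

```python
def settle_backup_transaction(data, primary_completed, pending_a_log, b_log):
    """Bring secondary data to a transaction boundary before recovery.

    data          -- dict of records currently held on the secondary system
    pending_a_log -- (key, after_image) pairs not yet applied
    b_log         -- (key, before_image) pairs for updates already applied
    """
    result = dict(data)
    if primary_completed:
        # The transaction completed on the primary: let the backup complete too.
        for key, after in pending_a_log:
            result[key] = after
    else:
        # The transaction was canceled: undo applied updates, newest first.
        for key, before in reversed(b_log):
            if before is None:
                result.pop(key, None)
            else:
                result[key] = before
    return result
```

Only after this step is the secondary data consistent, at which point the entries and blocks requested by the primary system can be transmitted.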

The data on the primary system is restored on the basis of the information transmitted from the secondary system. The restoration of the data differs between hardware failure and simple missing data.

In the case of a hardware failure, it is not possible to write the data back to the location of the original data on the primary system, and so the space for recovery is secured in a new storage region. For example, where the storage region of a part of the entries of a location table has been destroyed, that portion alone may be newly reserved and used as a discontiguous location table or a new region may be reserved for the entire location table and the location table placed in a contiguous region. Although location tables hold block addresses, when a recovery is not performed on the blocks themselves, there will be no modification of entry information.

If it is the blocks that have been lost, a region for the blocks may be reserved and the blocks written to on the basis of the recovery information transmitted from the secondary system. However, since the addresses of those blocks will be modified in this case, the block addresses in the affected entries of the location table must be rewritten. Further, where alternate-key table entries hold block addresses, it will be necessary to rewrite those alternate-key entries and the recovery will take considerable time. Therefore, this aspect too must be given consideration in the selection of a format for alternate-key entries.

Notes

A database system may also be implemented as follows. A data and database system on computers may be characterized by capability for storage of multiple entries each comprising an alternate key, the number of the block storing the applicable record and the primary key of the applicable record, the use of alternate-key blocks that may be reserved contiguously in advance in an identical size and in their requisite number, the use of alternate-key location tables that place the numbers assigned to those alternate-key blocks in correspondence with their physical location in storage devices in order to manage the locations of those alternate-key blocks in those storage devices, the storage of alternate-key entries in the alternate-key blocks in the order of their alternate keys, the allocation of new alternate-key overflow blocks and the storage in them of alternate-key entries when the alternate-key blocks are unable to store further entries, and the management of the location of the alternate-key blocks in the storage devices by means of alternate-key location tables. When data is retrieved by an alternate key, the entry containing the target alternate-key value may be found by searching the alternate-key location tables, the block storing the target record may be known from that entry and the record may be retrieved from that block.

Disallowing Synchronization of Reorganization in a Secondary System

The coordination with the backup and recovery system recited above concerns reorganization executed on a synchronized primary system and secondary system.

Reorganization may also be executed on an unsynchronized primary system and secondary system. A description follows.

In the backup and recovery system, it is known by means of the transmission of A logs from the primary system to the secondary system which records have been modified in what manner, and data on the secondary system is modified on the basis of the A logs. Also, in the backup and recovery system, “In addition to the content of the post-update data (A log), the nature of the update (differentiating among updates, additions and deletions) and file identification, the message should also include the number of the block where the data is stored and the leading address of the record in the block. The transmission of this data speeds up the reading of the locations written to on the secondary system 2.”

Since records include primary keys, it is possible to identify which records are affected if the content of the records is known, but block numbers and offsets within blocks should be included for purposes of acceleration.

If only record content and the nature of the update are transmitted as the A log here, it is not possible to readily determine on the secondary system which blocks are affected, and it is necessary to perform a binary search of the location table, in like fashion to regular access, to find the blocks affected. This means that on the secondary system the storage location of records modified in backup may be known by means of performing a binary search on the location table.

And in the reorganization recited above, the use of reorganization pointers permits determination of whether to use the current location table or the new location table, and this allows the definitive retrieval of records and other access regardless of how far in the location table reorganization has advanced.

Therefore, it is possible to obtain a backup even where the primary system and the secondary system are not at all synchronized.

A more specific description makes reference to FIG. 21. FIG. 21 illustrates a state in which, in the primary system, reorganization has completed through entry 3 in the current location table and through entry 6 in the new location table. On the other hand, it illustrates a state in which, in the secondary system, reorganization has completed through entry 1 in the current location table and through entry 3 in the new location table. It is known that in this case backup may be executed with no problems on the blocks managed in the current location table entries through entry 1 and from entry 4 onward. However, although reorganization of entries 2 and 3 (actually entries 4, 5 and 6 in the new location table) in the current location table has been completed on the primary system, their backup on the secondary system has not been completed.

Assume modification of a record stored in block 3 in these circumstances. On the primary system this block is managed by entry 6 in the new location table and its new block number is 6. Meanwhile, on the secondary system it is managed in the current location table and its block number remains 3. Since the block numbers do not correspond in this case, transmission of the block number is meaningless. The record content and segment modified are transmitted from the primary system. In this case, the record is updated. On the secondary system, a binary search is performed on the location table on the basis of the primary key contained in the record. Since it is greater than the reorganization pointer in this case, the binary search is performed on the current location table. The record is then determined to be one in block 3, and that record in block 3 is updated. Thus, a backup may definitively be performed by determining which location table to use depending on whether the primary key is greater than or less than the reorganization pointer on the secondary system and performing a binary search.
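The table selection just described may be sketched as follows. This is an illustration under assumptions: each location table is modeled as a sorted list of `(lowest_key, block_number)` pairs, the reorganization pointer is represented by `rp_key`, the lowest primary key not yet reorganized, and the key values are invented for the example and do not come from FIG. 21. Treating a key equal to `rp_key` as belonging to the current table is likewise an assumption.

```python
from bisect import bisect_right

def find_block(table, key):
    """Binary search a location table given as sorted (lowest_key, block_no) pairs."""
    lows = [low for low, _ in table]
    i = bisect_right(lows, key) - 1
    if i < 0:
        raise KeyError(key)
    return table[i][1]

def locate(key, lc, ln, rp_key):
    """Choose the table by comparing the key with the reorganization pointer."""
    if key >= rp_key:
        return "LC", find_block(lc, key)   # not yet reorganized on this system
    return "LN", find_block(ln, key)       # already moved to the new table

# Hypothetical secondary-system state: entries 0 and 1 of LC already reorganized
# into entries 0 through 3 of LN, so keys from 200 upward still live in LC.
lc = [(0, 0), (100, 1), (200, 2), (300, 3), (400, 4)]
ln = [(0, 0), (50, 1), (100, 2), (150, 3)]
rp_key = 200
```

With these values a record whose primary key is 310 is found through the current location table in block 3, regardless of the block number it carries on the primary system.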

Although the description above applies to location tables and blocks, it is possible in exactly the same fashion to obtain backups of alternate-key location tables and alternate-key blocks without synchronizing the primary system and the secondary system. Furthermore, it is possible to reorganize the primary system and to reorganize the secondary system at a different time, rather than reorganizing both at the same time.

The advantage of this system is that, since it is not necessary to synchronize the primary system and the secondary system, the overhead required for scheduling may be eliminated. It is also a simple matter to perform recovery when the primary system suffers a fault during the rewriting of a block on the primary system.

Next, with respect to recovery, where backups are obtained without synchronizing the primary system and the secondary system, as described above, it is readily imagined that the structure of blocks and alternate-key blocks may be different on the primary system and the secondary system.

Even in such circumstances, recovery is possible. Where a specific range of a location table is lost, the range of primary key values in the lost section of the location table may be known by reading the surrounding sections of the location table. Recovery is possible by reading the location table and blocks in that range from the secondary system and reproducing them on the primary system. This applies likewise to alternate-key location tables.

Where a specific range of blocks is lost, recovery is possible in exactly the same way as above by specifying the range lost, reading the location table and blocks in that range on the secondary system and reproducing them on the primary system. This applies likewise to alternate-key location tables.
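
The recovery of a lost range may be sketched as follows. This is an illustration under stated assumptions, not the recovery procedure of the invention: each surviving location table entry is assumed to record the lowest and highest primary key of its block, lost entries are represented by `None`, blocks are dictionaries keyed by primary key, and the function names and the `block_size` parameter are hypothetical. Because the block structure may differ between the two systems, the sketch recovers by key range rather than by block number.

```python
def lost_key_range(table, first_lost, last_lost):
    """Read the entries surrounding the lost section to bound its key range.

    Returns (low, high): lost keys satisfy low < key < high. A bound is None
    when the lost section touches the start or end of the table.
    """
    low = table[first_lost - 1]["max_key"] if first_lost > 0 else None
    high = table[last_lost + 1]["min_key"] if last_lost + 1 < len(table) else None
    return low, high


def records_in_range(secondary_blocks, low, high):
    """Collect (key, record) pairs from the secondary system inside the bounds."""
    found = []
    for block in secondary_blocks:
        for key, record in block.items():
            if (low is None or key > low) and (high is None or key < high):
                found.append((key, record))
    return sorted(found)


def rebuild(records, block_size):
    """Repack recovered records into new blocks and matching table entries."""
    blocks = [dict(records[i:i + block_size])
              for i in range(0, len(records), block_size)]
    entries = [{"min_key": min(b), "max_key": max(b)} for b in blocks]
    return entries, blocks


# Primary location table with entries 1 and 2 lost.
primary_table = [{"min_key": 1, "max_key": 9}, None, None,
                 {"min_key": 40, "max_key": 49}]
# Secondary blocks are laid out differently from the primary blocks.
secondary_blocks = [{1: "a", 5: "b", 9: "c", 12: "d"},
                    {15: "e", 22: "f"},
                    {31: "g", 40: "h", 45: "i"}]

low, high = lost_key_range(primary_table, 1, 2)
recovered = records_in_range(secondary_blocks, low, high)
new_entries, new_blocks = rebuild(recovered, block_size=3)
```

Here the surrounding entries show that the lost keys lie strictly between 9 and 40, so only the records with keys 12, 15, 22 and 31 are read from the secondary system and repacked on the primary system.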

Advantages of the Invention

Accordingly, several objects and advantages of the present invention are as follows. Owing to the provision of a current location table and a new location table, the processing of one or multiple blocks in a pass and the sequential transfer of entries from the current location table to the new location table, reorganization may be effected without suspension of operation of the system.

As only the space for the new location table is required when effecting reorganization, reorganization is possible with a far smaller region than has been required in conventional systems.

The three objects of the elimination of overflow blocks, the elimination of fragmentation and the reservation of suitable initial storage rates may be accomplished at once in reorganization.

Each of the current location table and the new location table is provided with a reorganization pointer, and the reorganization pointers store the locations where sequential reorganization of single or multiple blocks terminated in a given reorganization pass. A target record may therefore be definitively retrieved even during reorganization. When a record is retrieved during reorganization by means of its primary key, the target primary-key value is compared with the primary-key value of the record contained in the primary block or overflow block of the entry pointed to by the reorganization pointers. The current location table is used to retrieve the target record when the target primary-key value is found to be greater than or equal to the primary-key value of the record stored in the block pointed to by the reorganization pointers, and the new location table is used when it is found to be less than that primary-key value. Records may likewise be updated, added and deleted.

Due to the use of reorganization pointers, reorganization may be suspended and resumed at any time. It is therefore possible to perform reorganization without scheduling time for reorganization.
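
The interplay of reorganization passes, reorganization pointers and retrieval described above may be sketched as follows. This is a simplified illustration, not the implementation of the invention. It assumes in-memory blocks holding dictionaries of records, non-empty blocks, overflow chains whose keys follow those of their primary block, and a linear scan in place of the binary search for brevity. The class and method names are hypothetical, and the adjustment of storage rates and elimination of fragmentation are omitted.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Block:
    records: dict                        # primary key -> record
    overflow: Optional["Block"] = None   # linked overflow block, if any


@dataclass
class Database:
    current: list                              # current location table (primary blocks in key order)
    new: list = field(default_factory=list)    # new location table
    reorg_ptr: int = 0                         # next current-table entry to reorganize

    def reorganize_pass(self, n_entries=1):
        """Move up to n_entries entries to the new table; return True if work remains."""
        for _ in range(n_entries):
            if self.reorg_ptr >= len(self.current):
                return False
            block = self.current[self.reorg_ptr]
            while block is not None:
                # Delink each overflow block so that it becomes a primary block.
                nxt, block.overflow = block.overflow, None
                self.new.append(block)
                block = nxt
            self.reorg_ptr += 1
        return self.reorg_ptr < len(self.current)

    def lookup(self, key):
        """Retrieve a record, choosing the table by comparison at the pointer."""
        pending = self.current[self.reorg_ptr:]
        if pending and key >= min(pending[0].records):
            for block in pending:            # not yet reorganized: current table
                while block is not None:
                    if key in block.records:
                        return block.records[key]
                    block = block.overflow
            return None
        for block in self.new:               # already reorganized: new table
            if key in block.records:
                return block.records[key]
        return None


db = Database(current=[
    Block({1: "a", 2: "b"}, overflow=Block({3: "c"})),
    Block({10: "d"}),
    Block({20: "e"}, overflow=Block({25: "f"})),
])

more_work = db.reorganize_pass()   # one pass, then reorganization is suspended
mid_low = db.lookup(3)             # served from the new location table
mid_high = db.lookup(25)           # served from the current location table
db.reorganize_pass(5)              # resumed later from the stored pointer
```

Because the only state carried between passes is the pointer position and the two tables, a pass may stop after any entry and resume later, and lookups issued between passes still find every record.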

Each of the current alternate-key location table and the new alternate-key location table is provided with a reorganization pointer, and the reorganization pointers store the locations where sequential reorganization of single or multiple blocks terminated in a given reorganization pass. A target entry may therefore be definitively retrieved even during reorganization. When a record is retrieved during reorganization by means of an alternate key, the target alternate-key value is compared with the alternate-key value of the entry contained in the alternate-key block of the entry pointed to by the reorganization pointers. The current alternate-key location table is used to retrieve the target entry when the target alternate-key value is found to be greater than or equal to the alternate-key value of the entry in the alternate-key block pointed to by the reorganization pointers, and the new alternate-key location table is used when it is found to be less than the alternate-key value of that entry.

Each of the current alternate-key location table and the new alternate-key location table is provided with a reorganization pointer, and the reorganization pointers store the location through which reorganization has been done in a given reorganization pass. A target record may therefore be definitively retrieved even during reorganization. When a record is retrieved during reorganization by means of an alternate key undergoing reorganization, the target alternate-key value is compared with the alternate-key value of the entry contained in the alternate-key block of the entry pointed to by the reorganization pointers. The current alternate-key location table is used to retrieve the target record when the target alternate-key value is found to be greater than or equal to the alternate-key value of the entry stored in the alternate-key block pointed to by the reorganization pointers, and the new alternate-key location table is used when it is found to be less than the alternate-key value of that entry. Records may likewise be updated, added and deleted.

Owing to the ability to execute reorganization as a transaction like that of the updating, addition or deletion of a record, and owing to the consistency of the sequence of exclusion of location tables, blocks, alternate-key location tables and alternate-key blocks, the system is not susceptible to deadlock.

The integrity of data maintenance is assured because the data backup and recovery recited in Japanese Patent for a data backup and recovery system may be effected together with the reorganization of the present invention.

1. (canceled)
 2. (canceled)
 3. (canceled)
 4. (canceled)
 5. (canceled)
 6. (canceled)
 7. (canceled)
 8. (canceled)
 9. (canceled)
 10. (canceled)
 11. (canceled)
 12. (canceled)
 13. (canceled)
 14. (canceled)
 15. (canceled)
 16. A database reorganization system, comprising: data records for holding data entries, each data record containing a primary key; primary blocks for storing data records in the order of the primary keys thereof; overflow blocks linked to the primary blocks; a current location table and a new location table for containing in contiguous regions entries describing the addresses of the primary blocks; a reorganization pointer for the current location table; a final pointer for the current location table; and a reorganization pointer for the new location table.
 17. The database reorganization system of claim 16, wherein the database reorganization system is configured to sequentially write entries in the current location table to the new location table and, where any overflow block is present, to delink said overflow blocks, creating new entries corresponding to the primary blocks and adding the new entries to the new location table.
 18. The database reorganization system of claim 16, further comprising: a first means for, upon receipt of a database reorganization command, creating a new location table in addition to the current location table; and a second means for sequentially writing entries in the current location table to the new location table and, when an overflow block linked to a primary block is detected, delinking that overflow block, adding new entries to the new location table, and rendering the overflow block as a new primary block.
 19. The database reorganization system of claim 16, further comprising: means for shifting fore and aft records in primary blocks and eliminating fragmentation when a storage rate in primary blocks falls outside a range of predetermined values; and means for sequentially writing entries in the current location table to the new location table.
 20. The database reorganization system of claim 16, further comprising: means for sequentially writing entries in the current location table to the new location table and maintaining in the primary blocks the initial storage rates used in the primary blocks.
 21. The database reorganization system of claim 16, further comprising: a comparative means for, when retrieving a record by the primary key during reorganization, comparing the value of the target primary key with the value of the primary key of the record contained in the primary block and the overflow blocks of the entry indicated by at least one of the reorganization pointers; and a retrieval means for using the current location table to retrieve the target record when the value of the target primary key is found by the comparative means to be greater than or equal to the value of the primary key of the record stored in the blocks indicated by at least one of said reorganization pointers and for using the new location table to retrieve the target record when it is found to be less than the value of the primary key.
 22. A database reorganization system, comprising: data records for holding data containing primary keys and alternate keys; alternate-key entries that hold data entries, each alternate-key entry comprising an alternate key and a primary key; alternate-key blocks for containing the alternate-key entries; alternate-key overflow blocks linked to the alternate-key blocks; a current alternate-key location table and new alternate-key location tables for containing alternate-key location table entries in contiguous regions; a reorganization pointer for the current alternate-key location table which indicates a progress of reorganization of the alternate-key location table and alternate-key blocks for the current alternate-key location tables; a final pointer which indicates a final point of the most currently used entry of the alternate-key location table for the current alternate-key location tables; and a reorganization pointer for the new alternate-key location table.
 23. The database reorganization system of claim 22, further comprising: means for sequentially writing entries in current alternate-key location tables to a new alternate-key location table and, where an alternate-key overflow block exists, delinking the alternate-key overflow block, creating new alternate-key location table entries corresponding to the alternate-key blocks and adding new alternate-key location table entries to a new alternate-key location table.
 24. The database reorganization system of claim 22, further comprising: a first means for, upon receipt of a database reorganization command, creating a new alternate-key location table in addition to the current alternate-key location table; and a second means for sequentially writing entries in the current alternate-key location table to the new alternate-key location table and, when an alternate-key overflow block linked to an alternate-key block is detected, delinking that alternate-key overflow block, adding new alternate-key location table entries to the new alternate-key location table and rendering these as new alternate-key blocks.
 25. The database reorganization system of claim 22, further comprising: means for shifting fore and aft records in the alternate-key blocks and eliminating fragmentation when the storage rate in the alternate-key blocks falls outside a range of the specified values; and means for sequentially writing entries in the current alternate-key location table to the new alternate-key location table.
 26. The database reorganization system of claim 22, 23 or 24, further comprising: a comparative means for, when retrieving a record by the alternate key during reorganization, comparing the value of the target alternate key with the value of the alternate key of the record contained in the alternate-key block of the entry indicated by at least one of said reorganization pointers; and a retrieval means for using the current alternate-key location table to retrieve the target record when the value of the target alternate key is found by the comparative means to be greater than or equal to the value of the alternate key of the record stored in the alternate-key blocks indicated by at least one of the reorganization pointers and for using the new alternate-key location table to retrieve the target record when it is found to be less than the value of that alternate key.
 27. A database system, comprising: data records for holding data entries, each data record may contain primary keys and zero or one or more alternate keys; primary blocks for storing data records in the order of the primary keys thereof; alternate-key entries that hold data entries, each alternate-key entry comprising an alternate key and a primary key; alternate-key blocks for containing the alternate-key entries; a current alternate-key location table for containing alternate-key location table entries in contiguous regions; and means for storage of the alternate-key entries in the alternate-key blocks in the order of their alternate keys and, when no further entries may be stored in the alternate-key block, linkage of a new alternate-key overflow block to that alternate-key block and storage in that alternate-key overflow block of alternate-key entries that cannot be stored in the alternate-key block.
 28. The database reorganization system of claim 16, further comprising: means for shifting fore and aft records in primary blocks and eliminating fragmentation when the storage rate in primary blocks falls outside a range of specified values; contiguous regions joined for storage of the addresses of unused blocks resulting from the elimination of fragmentation; and pointers which indicate the start points and end points of those contiguous regions.
 29. A database reorganization system, comprising: data records for holding data entries, each data record containing a primary key; primary blocks for storing data records in the order of the primary keys thereof; overflow blocks linked to the primary blocks; a current location table for containing in a contiguous region entries describing the addresses of the primary blocks; a first means for, upon receipt of a database reorganization command, creating a new location table in addition to the current location table; a second means for sequentially writing entries in the current location table to the new location table and, when an overflow block is detected, delinking that overflow block, adding new entries to the new location table and rendering these as new primary blocks in the new location table; and a third means for writing current database blocks as primary blocks in the new location table.
 30. A database reorganization system, comprising: data records for holding data entries, each data record may contain a primary key; primary blocks for storing data records in the order of the primary key thereof; a first means, in a backup database reorganization system having a current location table containing in a contiguous region entries describing the addresses of the primary blocks, for creating, upon receipt of a database reorganization command, a new location table in addition to the current location table; and a second means, in that backup database reorganization system, for sequentially writing primary block entries in the current location table to the new location table and, when an overflow block linked to a primary block is detected, delinking the overflow block, adding new entries to the new location table and rendering these as new primary blocks.